This is the mail archive of the
mailing list for the GCC project.
Re: [rfc] multi-word subreg lowering pass
- From: Ian Lance Taylor <ian at airs dot com>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: Björn Haase <bjoern dot m dot haase at web dot de>, gcc-patches at gcc dot gnu dot org, Richard Henderson <rth at redhat dot com>
- Date: 01 Jun 2006 11:26:01 -0700
- Subject: Re: [rfc] multi-word subreg lowering pass
- References: <Pine.LNX.firstname.lastname@example.org>
Roger Sayle <email@example.com> writes:
> I just thought I'd mention that I've been investigating variants
> of RTH's subreg lowering patch myself. However, rather than perform
> this as a late RTL pass, I've been trying out variants that do the
> lowering at RTL expansion time. The mechanism is for the backend to
> specify CONCAT_MODE_P(mode) for integer modes, meaning that it would
> like operations in this mode to be represented as operations on a
> pair of registers in a narrower mode. This then reuses the existing
> expansion-time CONCAT mechanism for tracking pseudo pairs, and debug

I've also been working on this. I'm currently running a version of
RTH's lower-subreg pass twice, once early and once after splitting
instructions. Then I've added splits for, e.g., adddi3. This gives
me good results, including results like the ones you showed. In order
to improve real benchmarks, though, I still have some more work to do,
specifically to enhance the register allocators to understand that
DImode copies represented as two SImode copies do not indicate a
register conflict (essentially, an implementation of REG_NO_CONFLICT
which works for the global register allocator and doesn't actually
require the REG_NO_CONFLICT notes).
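
To make the idea of splitting a double-word add concrete, here is a minimal C sketch of what lowering a 64-bit addition into word-sized operations amounts to. It is an illustration only, not GCC code or a machine-description pattern: the names word_pair, split64, join64 and add_pair are hypothetical, and it assumes 32-bit words, so the pair plays the role of a DImode value held in two SImode pseudos.

```c
#include <stdint.h>

/* Hypothetical illustration: a 64-bit value held as two 32-bit words,
   analogous to a DImode value lowered to a pair of SImode pseudos. */
struct word_pair {
    uint32_t lo;  /* low-order word */
    uint32_t hi;  /* high-order word */
};

/* Break a 64-bit value into its low and high words. */
static struct word_pair split64(uint64_t v)
{
    struct word_pair p;
    p.lo = (uint32_t)(v & 0xFFFFFFFFu);
    p.hi = (uint32_t)(v >> 32);
    return p;
}

/* Reassemble the 64-bit value from its two words. */
static uint64_t join64(struct word_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Add two pairs one word at a time. The low words are added first;
   unsigned wraparound (sum smaller than an operand) signals a carry,
   which is then added into the high-word sum. */
static struct word_pair add_pair(struct word_pair a, struct word_pair b)
{
    struct word_pair r;
    uint32_t carry;

    r.lo = a.lo + b.lo;
    carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

For example, join64(add_pair(split64(x), split64(y))) should equal x + y modulo 2^64 for any 64-bit x and y. Once the operation is expressed this way, each word lives in its own word-sized register, which is the kind of exposure a word-level lowering is after.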

I haven't sent any of this work out because my employer does not yet
have a copyright assignment.

When using CONCAT at expansion time, I'd be worried about losing RTL
level optimizations on DImode values, as you suggested at the end of
your message. I don't know how valid that concern is--I don't know
how much we can expect the tree level code to handle.

The approach I'm using is neutral with regard to targets with two
register sizes. You can control what happens by writing appropriate
insns and splits. It doesn't help the case of deciding when it's
better to use a more expensive 64-bit register.