This is the mail archive of the
mailing list for the GCC project.
Re: [i386] Scalar DImode instructions on XMM registers
- From: Richard Henderson <rth at redhat dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>, Ilya Enkovich <enkovich dot gnu at gmail dot com>
- Cc: GCC Development <gcc at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>, vmakarov at redhat dot com
- Date: Thu, 07 May 2015 11:22:16 -0700
- Subject: Re: [i386] Scalar DImode instructions on XMM registers
On 05/07/2015 09:24 AM, Richard Henderson wrote:
> I was wondering this morning about the possibility of a kind of constraint that
> would allow RA to generate pairs of registers via CONCAT. That is, the two
> hard registers within the CONCAT are collectively the double-word allocation,
> but need not be sequential like current multi-word allocations. A target using
> such a constraint is promising to handle the CONCAT either by splitting (and
> gen_lowpart et al), or print_operand letters (e.g. the m68k %R, for outputting
> the low part of a pair).
>
> With that, we get the best of both -- lower-subreg effectively happening in RA,
> and DImode arithmetic in SSE no subregs required.

I forgot one issue that lower-subreg also cures -- describing the lifetime of
the pair of registers. We wouldn't get that with a single bit saying that
CONCAT is ok.

	di100 = di101 + di102

	(flags, si200) = si201 + si202
	si300 = si301 + si302 + carry(flags)

If we split prior to RA, we can see that si200 cannot overlap si301 or si302,
since si200 is set while both are still live.
If we split after RA, we have to handle this ourselves in the backend, leading
to additional matching-constraint alternatives and/or early-clobbers.
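
To make the overlap problem concrete, here is a minimal Python sketch (not GCC code) that simulates the split add on a toy register file. The register names r0..r5 and the helpers `split_add64` and `run` are hypothetical, chosen only for illustration; each 64-bit value is a (low, high) pair of 32-bit registers that need not be adjacent, as in the CONCAT idea.

```python
MASK32 = 0xFFFFFFFF

def split_add64(regs, dst, a, b):
    """Add two 64-bit values held as (low_reg, high_reg) name pairs."""
    # Low half: (flags, si200) = si201 + si202
    low_sum = regs[a[0]] + regs[b[0]]
    regs[dst[0]] = low_sum & MASK32
    carry = low_sum >> 32
    # High half: si300 = si301 + si302 + carry(flags)
    # The high inputs are read only now, after the low output was written.
    regs[dst[1]] = (regs[a[1]] + regs[b[1]] + carry) & MASK32

def run(dst, a, b, x, y):
    """Load x and y into pairs a and b, add them, and return the 64-bit result."""
    regs = {}
    regs[a[0]], regs[a[1]] = x & MASK32, x >> 32
    regs[b[0]], regs[b[1]] = y & MASK32, y >> 32
    split_add64(regs, dst, a, b)
    return (regs[dst[1]] << 32) | regs[dst[0]]

x, y = 0x1FFFFFFFF, 0x200000001

# Disjoint registers: the result is correct.
good = run(("r0", "r1"), ("r2", "r3"), ("r4", "r5"), x, y)

# Low output shares r3 with a high input: the high input is lost.
bad = run(("r3", "r1"), ("r2", "r3"), ("r4", "r5"), x, y)

# Output pair reuses the first input pair half for half: still correct.
in_place = run(("r2", "r3"), ("r2", "r3"), ("r4", "r5"), x, y)

print(hex(good), hex(bad), hex(in_place))
```

In this toy model only the second allocation gives a wrong sum. That is the case a backend splitting after RA would have to rule out with an early-clobber or extra alternatives.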
We'd need a couple of bits: one saying that CONCAT is ok, the other saying
whether all lows are consumed before all highs when allocating a set of
CONCATs across all of the operands.
Or perhaps we don't need such a bit and we merely include "high inputs not
clobbered by low output" as part of the contract with RA.
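
As a rough illustration of what such a contract might amount to, the sketch below checks a proposed assignment of (low, high) register pairs. It is an assumption-laden toy, not how GCC's register allocator works: the function name `violates_contract` and the register names are hypothetical, and it assumes the low half is always computed before the high half.

```python
def violates_contract(dst, srcs):
    """Return True if the low output register is also some input's high register.

    dst is a (low_reg, high_reg) pair; srcs is a list of such pairs.
    Assumes the low half is computed first, so only the low output can
    clobber a high input that has not been read yet.
    """
    low_out = dst[0]
    return any(low_out == src[1] for src in srcs)

inputs = [("r2", "r3"), ("r4", "r5")]

print(violates_contract(("r0", "r1"), inputs))  # disjoint registers
print(violates_contract(("r3", "r1"), inputs))  # low output reuses a high input
print(violates_contract(("r2", "r3"), inputs))  # output pair matches first input pair
```

Under these assumptions only the second assignment is rejected. Reusing an input pair half for half stays allowed, which is what a matching constraint would express today.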