This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: [trunk] Addition to subreg section of rtl.text.


On Thu, Mar 20, 2008 at 10:39:47AM +0000, Richard Sandiford wrote:
> you're saying that, for any valid values of M and X:
> 
>   (set (subreg:M (reg:N ...) X) (const_int 0))
> 
> does not guarantee that (subreg:M (reg:N ...) ...) has the value 0
> if N is a partial mode?

Yes.  Although it will be more common for 1 bits to change to zero, some bits
might actually differ between successive reads, when these bits are status
flags.  AFAICR there was even some processor that had a level-sensitive
I/O port bit mapped in its status register.

> > And @code{(subreg:SI (reg:DF 10) 0)} would be a natural way to express that
> > you are using the floating point register as a 32 bit integer register,
> > with writes clobbering the entire 64 bit of the register.
> 
> Yes, this is one possible definition.  But there's no reason in this
> situation why you couldn't just use a single REG.  Why use subregs at all?

Because before reload, you use pseudos.  And in order for
(subreg:SI (reg:DF ...) ...) to be viable, it still has to be viable between
hard register allocation and alter_reg.

> I thought in the earlier post, you were suggesting that it should be
> OK to represent a doubleword register that has individually-addressable
> words as a single register if most accesses were of the doubleword variety.
> I thought you were then saying that you could use (subreg:SI (reg:DF ...) ...)
> to refer to the individually-addressable parts.  In that scenario you
> _wouldn't_ want (subreg:SI (reg:DF 10) 0) to clobber the whole register.

Well, yes, but then it would be the exception, so you
could use strict_low_part.  Or zero_extract if we say that's OK.

> >> My understanding was that nested subregs aren't allowed (any more).
> >
> > That's why I talked about spreading it across multiple instruction patterns.
> > Unfortunately that can leave you with multiple machine instructions
> > where one would do, just because the middle-end is in denial that these
> > things might exist.
> 
> It just seems to me that, by the time you get to the stage of having
> multiple instructions for a single write, you've lost any advantage
> you've gained by avoiding unspecs.

You will at times see the various SUBREGs be generated from C code.
The restrictions on nested subregs and MODE_CLASS matching will mean
that the instructions using these SUBREGs cannot be combined by cse or
combine.

> > I think some of the rules are overly restrictive, and prevent gcc
> > from achieving its full potential for generating efficient code.
> > Moreover, if a port has an extv / insv pattern that matches in mode with the
> > wide registers, it can legitimately use the zero_extract route.  It's
> > reload that contradicts the documentation in changing registers into MEMs
> > and thus creating zero_extracts from wide MEMs.
> 
> It sounds like you might be referring to both the subreg and extract
> documentation here.  As far as the subreg documentation goes,
> let's assume that what I said above about partial modes is right
> (you'll have already corrected me by now if not).  If we change the
> rules to say that, what do you think is still overly restrictive?

- zero_extract officially only allowed for a specific mode.
- nested subregs not allowed, but not all of the subregs that would result
  from substituting a subreg into the inner reg of another subreg and
  simplifying are allowed either.
- highpart subregs not allowed (e.g. consider SH64 floating point registers:
  word_mode is 64 bit, but the floating point registers are 32 bit.  How do
  you refer to the high part of a DFmode value, considering that the
  inner reg might be allocated to a floating point register?  (Actually,
  you generally want such an allocation.))

> A specific edit to our rtl.texi proposal would probably be helpful
> at this stage.

Sorry, have to wait for Copyright Assignment :-(

> 
> E.g. one possibility would be to drop:
> 
>     If @var{reg} is a hard register, the @code{subreg} must also represent
>     the lowpart of a particular hard register, or represent one or more
>     complete hard registers.
> 
> and instead say that the word-based semantics for pseudo registers also
> apply to hard registers, regardless of the number of hard registers in
> the inner register.  This would in some ways be simpler.

Yes.  If subword-writing semantics are wanted, and the SUBREG represents
an actual hard register, then the port has to use a proper hard reg instead.
And it makes semantics much saner when we do register allocation for a
pseudo where we don't know the register size to start with.
What would make this still somewhat saner, though, would be if we had
a mechanism to make the subreg mechanism use different word sizes for
different inner modes for the purpose of identifying regions that are
wholly clobbered.  So, if you have 64 bit word_mode, but 32 bit floating
point registers, you could say that SUBREGS for floating point modes should
behave like you had a 32 bit word_mode.
Conversely, if you have 128 bit vector registers, you might want a DImode
subreg of a matching vector mode to clobber the entire pseudo.
Maybe a BITS_PER_REG (MODE) value, or an equivalent hook.
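
To make the intended effect concrete, here is a minimal sketch in Python
(chosen only for brevity).  The function name clobbered_bytes and its
unit_bytes parameter, which stands in for a BITS_PER_REG (MODE) / 8 value,
are hypothetical and not existing GCC interfaces.  The sketch assumes that
a write through a subreg clobbers every whole unit that it overlaps.

```python
def clobbered_bytes(inner_size, offset, outer_size, unit_bytes):
    """Return the (start, end) byte range of the inner register that a
    write through a subreg is assumed to clobber.

    inner_size -- size of the inner register mode in bytes
    offset     -- byte offset of the subreg within the inner register
    outer_size -- size of the subreg mode in bytes
    unit_bytes -- hypothetical per-mode unit (BITS_PER_REG (MODE) / 8),
                  used here in place of a single global word size
    """
    if offset < 0 or outer_size <= 0 or offset + outer_size > inner_size:
        raise ValueError("subreg does not fit inside the inner register")
    # Round the start down and the end up to unit boundaries.
    start = (offset // unit_bytes) * unit_bytes
    end = -(-(offset + outer_size) // unit_bytes) * unit_bytes
    # Never report more than the inner register itself.
    return start, min(end, inner_size)


# SImode write into a DFmode pseudo with a 64 bit unit: all 8 bytes go.
print(clobbered_bytes(8, 0, 4, 8))
# Same write with a 32 bit unit for floating point modes: low half only.
print(clobbered_bytes(8, 0, 4, 4))
# DImode write into a 128 bit vector pseudo with a 128 bit unit.
print(clobbered_bytes(16, 8, 8, 16))
```

With a single global word size, the first and second case could not both be
expressed for the same target; a per-mode unit lets each register class
pick the granularity that matches its hardware.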

> > Huh?  The documentation says that zero_extract follows BITS_BIG_ENDIAN,
> > so the memory layout doesn't come into play.  We have a 64 bit value,
> > and BITS_BIG_ENDIAN determines which bits are meant.
> 
> So you're saying that, if the above REG:DI were replaced by a MEM:DI,
> the zero_extract would represent a non-contiguous bitrange?

Yes, non-contiguous in memory, but contiguous in positional value.
This still gives a port maintainer more freedom than prohibiting the
zero_extract for these modes altogether.  If you like, you can put
a caveat in the documentation to replace the restriction.
Using BITS_BIG_ENDIAN throughout makes perfect sense when you operate on
a value that is generally in a register but might on occasion end up in
memory.
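
A minimal sketch of what "contiguous in positional value" means, in Python
for brevity.  The helper names, the parameters and the 32 bit word size are
assumptions made for illustration; none of this is GCC code.  zero_extract
numbers bit 0 from the most significant end when bits_big_endian is set and
from the least significant end otherwise, and memory_bytes shows which
memory bytes would hold the field if the 64 bit value were spilled.

```python
def field_shift(size, pos, width, bits_big_endian):
    """Distance of the field's lowest bit from the least significant bit."""
    if size <= 0 or pos < 0 or pos + size > width:
        raise ValueError("bit field does not fit in the value")
    return width - pos - size if bits_big_endian else pos


def zero_extract(value, size, pos, width=64, bits_big_endian=False):
    """Extract SIZE bits at position POS, purely by positional value."""
    shift = field_shift(size, pos, width, bits_big_endian)
    return (value >> shift) & ((1 << size) - 1)


def memory_bytes(size, pos, width=64, bits_big_endian=False,
                 bytes_big_endian=False, words_big_endian=False,
                 word_bytes=4):
    """Memory byte offsets that hold the field once the value is in memory."""
    shift = field_shift(size, pos, width, bits_big_endian)
    nwords = width // 8 // word_bytes
    offsets = []
    # Walk the bytes of the field in order of significance (0 = lowest).
    for sig in range(shift // 8, (shift + size - 1) // 8 + 1):
        word, byte = divmod(sig, word_bytes)
        if words_big_endian:
            word = nwords - 1 - word
        if bytes_big_endian:
            byte = word_bytes - 1 - byte
        offsets.append(word * word_bytes + byte)
    return sorted(offsets)


value = 0x1122334455667788
print(hex(zero_extract(value, 16, 8)))                        # 0x6677
print(hex(zero_extract(value, 16, 8, bits_big_endian=True)))  # 0x2233
# A field straddling the two 32 bit words, on a target where word order
# and byte order disagree: the bytes are not adjacent in memory.
print(memory_bytes(16, 24, words_big_endian=True))            # [0, 7]
```

The extracted value depends only on the bit numbering, so it stays the same
whether the operand lives in a register or has been spilled; only the set
of memory bytes involved changes.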

Having an rtx code mean different things when applied to memory rather than
registers is only asking for trouble, since then your operations change
when a register gets spilled to memory.  If we find we really need a
variant of sign_extract/zero_extract that adheres to BYTES_BIG_ENDIAN /
WORDS_BIG_ENDIAN semantics, we should have a separate rtx code for it.
I don't see how such an operation would actually make sense for bit fields
or integers as such; the only data that comes to mind where you might want
this would be strings.  And then, it would be more natural if you specified
sizes and positions in bytes rather than bits.

> (Yes, the documentation suggests byte_mode for MEMs, but the SH port
> uses zero_extracts of SImode MEMs as well, so presumably we're supposed
> to support other modes besides the documented ones.)

I just realized that it's missing a (mult:Pmode ... (const_int 8)) there.
The unaligned load / stores would certainly be simpler to get right with
a byte-extract rtl code.

SUBREG is actually similar to a byte-extract code, except we have
restrictions on what offsets and mode combinations are allowed, and the
extraction size has to match a mode.
If we had a byte-extract rtl code allowed for rvalues and lvalues, we
could replace strict_low_part, so it would not need to cause an overall
increase in rtl codes.
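
A minimal sketch of how such a byte-extract code could behave as an rvalue
and as an lvalue, in Python for brevity.  The names byte_extract and
byte_insert are hypothetical, and counting the offset from the least
significant byte is an assumption made to keep the example short; it is not
how SUBREG_BYTE is defined.

```python
def byte_extract(value, offset, nbytes):
    """Rvalue use: read NBYTES bytes starting OFFSET bytes above the
    least significant byte of VALUE."""
    if offset < 0 or nbytes <= 0:
        raise ValueError("invalid byte range")
    return (value >> (8 * offset)) & ((1 << (8 * nbytes)) - 1)


def byte_insert(value, offset, nbytes, new):
    """Lvalue use: replace the selected bytes and leave every other bit
    of VALUE unchanged, which is the guarantee strict_low_part gives
    for the low part."""
    if offset < 0 or nbytes <= 0:
        raise ValueError("invalid byte range")
    mask = ((1 << (8 * nbytes)) - 1) << (8 * offset)
    return (value & ~mask) | ((new << (8 * offset)) & mask)


reg = 0x11223344
print(hex(byte_extract(reg, 1, 2)))         # 0x2233
# Offset 0 covers what strict_low_part can express today.
print(hex(byte_insert(reg, 0, 2, 0xBEEF)))  # 0x1122beef
# A non-zero offset is something strict_low_part cannot express.
print(hex(byte_insert(reg, 1, 2, 0xBEEF)))  # 0x11beef44
```

Unlike SUBREG, neither the offset nor the size is tied to a machine mode
here, which is the extra freedom being argued for.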

