[PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

Segher Boessenkool segher@kernel.crashing.org
Tue Jan 27 21:55:00 GMT 2015


On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:
> On 01/26/15 22:11, Segher Boessenkool wrote:
> >On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
> >>The second change we need is an additional simplification.
> >>
> >>If we have
> >>(subreg:M1 (zero_extend:M2 (x)))
> >>
> >>Where M1 > M2 and both are scalar integer modes.  It's advantageous to
> >>strip the SUBREG and instead have a wider extension.
> >
> >Should you also check M1 is not multiple registers?
> We're generally working with pseudos, so we could estimate, but not know 
> for sure if we're dealing with multiple hard regs.  But more 
> importantly, I'm not sure what that check would buy us.

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if, say, it is implemented as a split, combine itself
cannot get rid of the redundancy.
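
To make the worry concrete, here is a rough C sketch of what a split
zero_extend:DI amounts to on a 32-bit target.  The struct and function
names are hypothetical and only model the idea; they are not GCC
internals, and a real port may well expand this differently.

```c
#include <stdint.h>

/* Hypothetical model of a DImode value on a 32-bit target:
   it occupies two word-sized registers.  */
struct di_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Models zero_extend:SI of a QImode value: one word-sized operation.  */
static uint32_t zext_qi_to_si(uint8_t x)
{
    return (uint32_t)x;
}

/* Models zero_extend:DI of a QImode value after splitting:
   one extension for the low word plus a separate clear of the
   high word, i.e. two word-sized operations instead of one.  */
static struct di_pair zext_qi_to_di(uint8_t x)
{
    struct di_pair r;
    r.lo = (uint32_t)x;  /* low word: the actual zero extension */
    r.hi = 0;            /* high word: extra instruction to zero it */
    return r;
}

/* Reassembles the two halves into a single 64-bit value for checking.  */
static uint64_t di_value(struct di_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

If nothing later uses the high word, the store to it is exactly the kind
of redundancy I mean.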

> Earlier versions checked reg_equal_p on the MEM.  But that's often a 
> mistake because the modes of the two memory references may be different. 
>  I don't recall which of the various tests, but I was definitely seeing 
> SImode in the load and HImode in the store.
> 
> Similarly you don't want to check reg_equal_p on the addresses as they 
> aren't necessarily the same either (they're obviously related).
> 
> That's how I ultimately settled on the rtx_referenced_p form you see above. 
>  I'm still not sure that's 100% what I want, but I don't have any tests 
> yet which require something more complex.

Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

> >Does this do anything good for the "dec mem" thing on x86?  That would
> >be a nice bonus :-)
> It might, but I haven't tested for that specifically.  If you've got 
> sample code or a PR in mind, pass it along and I'll take a look.  I'd 
> think dec mem would generally be handled by 3->1 insn combination code 
> unless there's something else going on.

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!


Segher


