This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Why does lower-subreg mark copied pseudos as "decomposable"?

From: Andrew Stubbs <ams at codesourcery dot com>
To: <gcc at gcc dot gnu dot org>, <rdsandiford at googlemail dot com>
Date: Wed, 18 Apr 2012 12:44:00 +0100
Subject: Re: Why does lower-subreg mark copied pseudos as "decomposable"?
References: <4F8D8E21.5070702@codesourcery.com> <87hawil1fr.fsf@talisman.home> <4F8DE0E6.4060603@codesourcery.com> <g4obqpia1g.fsf@richards-thinkpad.stglab.manchester.uk.ibm.com>

On 18/04/12 11:55, Richard Sandiford wrote:

The problem is that not all register moves are always going to be
eliminated, even when no mode changes are involved.  It might make
sense to restrict that code you quoted:

	    case SIMPLE_PSEUDO_REG_MOVE:
	      if (MODES_TIEABLE_P (GET_MODE (x), word_mode))
		bitmap_set_bit (decomposable_context, regno);
	      break;

to the second pass though.

Yes, I thought of that, but I dismissed it because the second pass is really very late. It would be just in time to take advantage of the relaxed register allocation, but would miss out on all the various optimizations that forward-propagation, combining, and such can offer.

This is why I've tried to find a way to do something about it in the first pass. I thought it makes sense to do something for non-no-op moves (when is there such a thing, btw, without it being an extend, truncate, or subreg?), but the no-op moves are trickier.

Perhaps a combination of the two ideas? Decompose mode-changing moves in the first pass, and all moves in the second?
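
To make the combined idea concrete, here is a minimal sketch of the decision it implies. It is a standalone toy, not code from lower-subreg.c: the enum, its values, and the function name are all hypothetical, and it only models "which moves get marked in which pass".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a pseudo-to-pseudo move.  These names
   are illustrative only and do not appear in lower-subreg.c.  */
enum move_kind
{
  MOVE_NO_OP,          /* plain copy, same mode on both sides */
  MOVE_MODE_CHANGING   /* copy that involves a mode change */
};

/* Return true if a move of kind KIND should mark its register as
   decomposable.  FIRST_PASS is true for the early pass, false for the
   late one.  */
static bool
mark_decomposable_p (enum move_kind kind, bool first_pass)
{
  assert (kind == MOVE_NO_OP || kind == MOVE_MODE_CHANGING);

  /* Early pass: only mode-changing moves count.  */
  if (first_pass)
    return kind == MOVE_MODE_CHANGING;

  /* Late pass: every simple move counts.  */
  return true;
}
```

In this model a no-op copy survives the first pass untouched, so forward propagation and combine still get a chance at it, and it is only considered for decomposition in the second pass.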

BTW, the lower-subreg pass has a forward propagation concept of its own. If I read it right, even with the above changes, it will still decompose the move if the register it copies from has been decomposed, and the register it copies to is not marked 'non-decomposable'.
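
As I understand that rule, it boils down to something like the following. Again this is only a sketch of my reading, not the pass's actual code: the struct and function are made up for illustration, and the real pass keeps this state in bitmaps rather than per-register flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pseudo state, invented for this sketch.  */
struct pseudo_state
{
  bool decomposed;        /* already split into word-sized pieces */
  bool non_decomposable;  /* must stay as one multiword register */
};

/* Return true if a copy from SRC to DEST would still be decomposed:
   the source has been decomposed and nothing forbids splitting the
   destination.  */
static bool
copy_gets_decomposed_p (const struct pseudo_state *src,
                        const struct pseudo_state *dest)
{
  assert (src != NULL && dest != NULL);
  return src->decomposed && !dest->non_decomposable;
}
```

So even if the move itself no longer marks anything, a decomposed source can still drag the copy along unless the destination is explicitly protected.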

Hmm, I'm going to try to come up with some testcases that demonstrate the different cases and see if that helps me think about it. Do you happen to have any to hand?

I've not studied that patch in detail, but I'm not sure it'll help. In
most cases, including my testcase, lowering is the correct thing to do
if NEON (or IWMMXT, perhaps) is not enabled.


Right.  I think I misunderstood, sorry.  I thought this regression was
for NEON only, but do you mean that adding these NEON patterns introduces
the regression for non-NEON targets as well?

No, you were right, the regression only occurs when NEON is enabled. Otherwise the machine description behaves exactly as it used to.

When NEON is enabled, however, it may still be the right thing to do:
NEON does not provide a full set of DImode operations. The test for
subreg-only uses ought to be enough to differentiate, once the
extraneous pseudos such as the one in my testcase have been dealt
with.


OK.  If/when that patch goes in, the ARM backend is going to have
to pick an rtx cost for DImode SETs.  It sounds like the cost will need
to be twice an SImode move regardless of whether or not NEON is enabled.

That sounds reasonable. Of course, how much a register move costs is a tricky subject for NEON anyway. :(
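
For illustration, the "twice an SImode move" rule could be modelled like this. It is a toy under stated assumptions, not the ARM backend's cost code: the macro is a local stand-in modelled on GCC's COSTS_N_INSNS, the 4-byte word size assumes 32-bit ARM, and the function name is hypothetical.

```c
#include <assert.h>

/* Local stand-in modelled on GCC's COSTS_N_INSNS, which scales an
   instruction count into cost units.  */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)

/* Assumption: 32-bit ARM, so a word is 4 bytes.  */
#define TOY_WORD_BYTES 4

/* Cost of a register-to-register SET of a mode MODE_BYTES wide, charged
   as one instruction per word regardless of whether NEON is enabled.  */
static int
toy_move_cost (int mode_bytes)
{
  assert (mode_bytes > 0);
  int words = (mode_bytes + TOY_WORD_BYTES - 1) / TOY_WORD_BYTES;
  return TOY_COSTS_N_INSNS (words);
}
```

With these assumptions an 8-byte (DImode) move costs exactly twice a 4-byte (SImode) one, which is the relationship suggested; it says nothing about what a NEON register move really costs.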

Andrew
