This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Why does lower-subreg mark copied pseudos as "decomposable"?


On 17/04/12 18:20, Richard Sandiford wrote:
> Andrew Stubbs <ams@codesourcery.com> writes:
>> Hi all,
>>
>> I can see why copying from one pseudo-register to another would not be a
>> reason *not* to decompose a register, but I don't understand why this is
>> a reason to say it *should* be decomposed.
>
> The idea is that, if a backend implements an N-word pseudo move using
> N word-mode moves, it is better to expose those moves before register
> allocation.  It's easier for RA to find N separate word-mode registers
> than a single contiguous N-word one.

Ok, I think I understand that, but it seems slightly wrong to me.


It makes sense to lower *real* moves, but before the fwprop pass there are quite a lot of pseudos that exist only as artefacts of the expand process. Moving the subreg1 pass after fwprop1 would probably do the trick, but it would likely also defeat the purpose of lowering early.

I've done a couple of experiments:

First, I tried adding an extra fwprop pass before subreg1. I also needed to move the dfinit pass earlier to make that work, but then it did: my testcase compiled without a regression.

Second, since adding an extra pass may be overkill, I tried adjusting lower-subreg to avoid the problem: I modified find_pseudo_copy so that it rejects copies that don't change the mode, on the principle that fwprop would probably have eliminated the move anyway. This also worked, and it is a much less expensive change.

Does that make sense? The pseudos involved in the move will still get lowered if the other conditions hold.
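
To make the second experiment concrete, here is a rough toy model of the heuristic in Python. It is not GCC code: `Pseudo`, `Copy`, `copy_counts_towards_decomposition` and the `reject_same_mode` switch are hypothetical names invented for illustration, the 32-bit word size is an assumption, and the DI-to-V2SI case is only a guess at what a mode-changing copy might look like.

```python
from dataclasses import dataclass

WORD_BITS = 32  # assumed word size (e.g. a 32-bit ARM target)


@dataclass(frozen=True)
class Pseudo:
    """Hypothetical stand-in for a pseudo-register; not a GCC data structure."""
    regno: int
    mode: str   # mode name, e.g. "SI", "DI", "V2SI"
    bits: int   # size of the mode in bits


@dataclass(frozen=True)
class Copy:
    """A simple register-to-register move: dest := src."""
    dest: Pseudo
    src: Pseudo


def copy_counts_towards_decomposition(copy, reject_same_mode=False):
    """Decide whether a pseudo-to-pseudo copy is a reason to decompose.

    reject_same_mode=False models the behaviour discussed in the thread:
    any multi-word pseudo copy counts.
    reject_same_mode=True models the experiment: copies that do not change
    the mode are ignored, on the assumption that fwprop would remove them.
    """
    # Copies between word-sized (or smaller) values are never candidates.
    if copy.dest.bits <= WORD_BITS and copy.src.bits <= WORD_BITS:
        return False
    # The experimental rule: a same-mode copy is not evidence either way.
    if reject_same_mode and copy.dest.mode == copy.src.mode:
        return False
    return True
```

In this model a DI-to-DI copy stops counting as a reason to decompose once `reject_same_mode` is set, while a copy that changes mode still counts. A pseudo rejected here can still be lowered if the other conditions (such as subreg-only uses) hold; the sketch covers only the copy test.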

> The problem is the "if a backend implements ..." bit: the current code
> doesn't check.  This patch:
>
> http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00094.html
>
> should help.  It's still waiting for me to find a case where the two
> possible ways of handling hot-cold partitioning behave differently.

I've not studied that patch in detail, but I'm not sure it'll help. In most cases, including my testcase, lowering is the correct thing to do if NEON (or IWMMXT, perhaps) is not enabled. When NEON is enabled, however, it may still be the right thing to do: NEON does not provide a full set of DImode operations. The test for subreg-only uses ought to be enough to differentiate, once the extraneous pseudos such as the one in my testcase have been dealt with.


Anyway, please let me know what you think of my solutions above, and I'll cook up a patch if they're ok.

Andrew

