This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, x86_64]: Provide longlong.h definitions for 128bit operations


Hi,

On Tue, 22 May 2007, Ian Lance Taylor wrote:

> Roman Zippel <zippel@linux-m68k.org> writes:
> 
> > I'm playing around with it a little on m68k and there is sometimes a bit 
> > of a conflict. E.g. I would split something like adddi3/subdi3 already 
> > before combine, because the addx/subx instruction is much more limited in 
> > its addressing mode (only accepts register), so combine can generate the 
> > right instructions.
> 
> That's right.  That's why I recommend a define_insn_and_split.  The
> insn will be complete for most of the RTL optimization passes,
> including combine.  It will then be split before the second lower
> subreg pass and before register allocation.

Somehow I think this can't be everything. Let's take a different example, 
which is also relevant for other ports: x + (y << 32). Here one has a 
partial constant argument, but one doesn't really know it until after 
splitting the shift. This could be optimized to a simple move/add (but 
currently isn't).
With a post-combine split one could now generate a combined add/shift 
pattern which could be split into the simpler operations, but this is 
too late for combine. A pre-combine split is also problematic, as the 
information that the carry flag is zero can't get to the next 
instruction, as it's usually hidden in a rather complex parallel pattern.
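
To make the x + (y << 32) case concrete, here is a minimal C sketch of 
what the word-wise result looks like on a 32-bit target. The function 
names are made up for illustration and this is not what any port 
currently generates; it only shows that the low word needs no add at all 
and the high word needs no carry-in.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the expression as written at the source level. */
uint64_t add_shift_ref(uint64_t x, uint64_t y)
{
    return x + (y << 32);
}

/* Hypothetical word-wise form on a 32-bit target.  The low word of
   (y << 32) is the constant 0, so the low-word add is just a move and
   cannot produce a carry; the high word is an ordinary add without
   carry-in (no addx needed). */
uint64_t add_shift_split(uint64_t x, uint64_t y)
{
    uint32_t x_lo = (uint32_t)x;
    uint32_t x_hi = (uint32_t)(x >> 32);
    uint32_t y_lo = (uint32_t)y;   /* becomes the high word after << 32 */

    uint32_t r_lo = x_lo;          /* x_lo + 0: move, carry is zero */
    uint32_t r_hi = x_hi + y_lo;   /* wraps modulo 2^32, like the 64-bit add */

    return ((uint64_t)r_hi << 32) | r_lo;
}

/* Small self-check that both forms agree for one pair of inputs. */
void check_add_shift(uint64_t x, uint64_t y)
{
    assert(add_shift_split(x, y) == add_shift_ref(x, y));
}
```

The "carry is zero" fact in the comment is exactly the information that 
has to survive the split for the second instruction to be simplified.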

To handle this one could either rerun combine after split (which may 
be done more than once) or we need a combined combine/split pass (which 
selectively just combines the newly created instructions with other 
instructions).

> Of course, that won't help your case, because you want to split before
> combine for other reasons.

There are more reasons, e.g. splitting post_inc/pre_dec operations later 
is rather painful; it would be so much easier if one only had to deal with 
simple memory operands and leave the bulk of the work to combine.

> > I think the reload problem may not be that bad, while a move instruction 
> > sets the cc register, it often just sets it in the same way as the 
> > previous instructions, especially the Z and M flag, which is enough for 
> > the common test against zero. For these tests the move wouldn't really 
> > clobber the flags, but just recreate them.
> 
> It's unfortunately not that simple.  In some cases reload will need to
> load some part of an address.  When it does that the flags will no
> longer have any relation to what they are supposed to have.

This requires an output reload, doesn't it? If we only have a single output, 
it would mean the final move recreates the right flags.
For floating point values this would be even easier, the post processing 
for setting the condition codes is always the same and move to memory 
doesn't even change them.
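
As a rough illustration of the "move just recreates the flags" argument 
for the integer case, here is a simplified C model of the condition 
codes. It assumes the usual m68k behaviour (an add sets the sign, zero, 
overflow and carry bits from the operation; a move sets sign and zero 
from the data and clears overflow and carry) and leaves out the X flag. 
The struct and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the condition codes (X flag omitted). */
struct ccr {
    bool n;  /* negative (sign) */
    bool z;  /* zero */
    bool v;  /* overflow */
    bool c;  /* carry */
};

/* Flags after a 32-bit add; the result is stored through *res. */
struct ccr flags_after_add(uint32_t a, uint32_t b, uint32_t *res)
{
    uint32_t r = a + b;
    struct ccr f;
    f.n = (r >> 31) != 0;
    f.z = (r == 0);
    f.v = (((a ^ r) & (b ^ r)) >> 31) != 0;  /* signed overflow */
    f.c = r < a;                              /* unsigned carry out */
    *res = r;
    return f;
}

/* Flags after a 32-bit move of `value`: sign and zero follow the data,
   overflow and carry are cleared. */
struct ccr flags_after_move(uint32_t value)
{
    struct ccr f;
    f.n = (value >> 31) != 0;
    f.z = (value == 0);
    f.v = false;
    f.c = false;
    return f;
}

/* True if a test that reads only the sign and zero bits sees the same
   flags in both states. */
bool zero_test_flags_match(struct ccr x, struct ccr y)
{
    return x.n == y.n && x.z == y.z;
}
```

In this model a move of the add result always matches on sign and zero, 
while overflow and carry can differ, which is why the sub/cmp case is the 
harder one.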

> > A bit more difficult is a sub/cmp combination, here move would really 
> > clobber flags, but luckily on m68k the sub instruction is quite flexible 
> > and e.g. also accepts memory arguments as destination, so it would be 
> > possible to convert an output reload into one or two input reloads.
> > So what basically would be needed is a port specific check in reload as to 
> > whether an output reload is acceptable and possibly reject the 
> > alternative and try it with another one.
> > That's currently the idea I would favour at least for m68k, does that 
> > sound reasonable (or at least understandable :) ).
> 
> That sounds unreasonable.  reload is already very complicated, too
> complicated, and there isn't going to be any way to drop in something
> like that without making it significantly more complicated.

Considering the alternatives I don't want to dismiss this option too quickly.
What is suggested in the wiki would only trade one problem for another: 
cc0 might be gone this way, but one would also have none of the benefits and 
would need a lot of extra work to get anywhere near where it was before.
The problem _is_ reload, so I'd prefer to look at the real problem instead 
of wasting a lot of effort to work around it.
Why are you so sure that there is not "any way"? Yes, reload is 
complicated, but I don't think these changes would make it significantly 
more complicated than it already is and I'm pretty sure that it will be 
less complicated than any of the alternatives.

BTW there is a move instruction which doesn't modify the flags (movem),
but I would really prefer to use it only as a last resort.

bye, Roman

