This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch,AVR]: Fix PR50447 (2/n)


On 09/23/2011 11:36 AM, Georg-Johann Lay wrote:
Paolo Bonzini wrote:
On 09/23/2011 10:56 AM, Paolo Bonzini wrote:
Also, I am curious about one thing: while this is of course a very
pragmatic solution, you could also convert AVR to get rid of CC0, do
this at expansion time, and get split-wide-types to work as intended.

My changes are just micro-optimizations to print things in a smarter way.

Yes, understood. There are two relatively large changes that we're talking about.


One is changing cc0 to CCmode. By itself it does not have large rewards. It is just an enabler. With CCmode, most of your algorithms would be usable anyway, so very little code would go away.

The second would be a large change instead: since AVR is not schedulable, it got by very well with complex patterns that produce multiple instructions. Changing that and doing the expansion early would be harder than introducing CCmode, but would have a relatively big reward: split-wide-types would actually provide improvements rather than getting in the way most of the time, and 8-bit registers could be allocated independently. (There is one notable pessimization: 16-bit add/sub would be used quite rarely.) Also, fwprop would be able to perform all the simplifications you're doing in these patches.

compare-elim.c makes it relatively easy to remove CC0 nowadays.

For example, a two-byte add can be written as


    (set (reg:QI L1) (plus:QI (reg:QI L2) (reg:QI L3)))
    (set (reg:QI H1) (plus:QI
                     (plus:QI (reg:QI H2) (reg:QI H3))
                     (ltu:QI  (reg:QI L1) (reg:QI L3))))
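
The carry-recovery idea in the RTL above can be checked outside the compiler. The following is only a minimal Python sketch, assuming 8-bit registers modeled as integers in 0..255; the function name `add16_bytewise` is hypothetical and not part of GCC. It shows that the unsigned comparison `L1 < L3`, taken after the low-byte add has wrapped, is 1 exactly when that add carried out.

```python
def add16_bytewise(l2, h2, l3, h3):
    """Add two 16-bit values given as (low, high) byte pairs, one byte at a time."""
    l1 = (l2 + l3) & 0xFF          # (plus:QI L2 L3), wraps modulo 256
    carry = 1 if l1 < l3 else 0    # (ltu:QI L1 L3): unsigned compare recovers the carry
    h1 = (h2 + h3 + carry) & 0xFF  # (plus:QI (plus:QI H2 H3) carry)
    return l1, h1


# Spot check against ordinary 16-bit addition (not exhaustive).
for a in (0x0000, 0x00FF, 0x01FF, 0x1234, 0xFFFF):
    for b in (0x0000, 0x0001, 0x00FF, 0xABCD, 0xFFFF):
        l1, h1 = add16_bytewise(a & 0xFF, a >> 8, b & 0xFF, b >> 8)
        assert (h1 << 8) | l1 == (a + b) & 0xFFFF
```

The comparison works because a wrapped sum is always smaller than either addend, while a sum that did not wrap is at least as large as both.
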

After reload the second instruction can be split to

   (set (reg:CC_C CC) (compare:QI (reg:QI L1) (reg:QI L3)))
   (set (reg:QI   H1) (plus:QI
                      (plus:QI    (reg:QI   H2) (reg:QI H3))
                      (lt:QI      (reg:CC_C CC) (const_int 0))))

What happens if L1=L3? AVR just has two-operand instructions so there are many insns with "0" constraint.

You can use L2 instead of L3; it is the same. If L1=L2=L3, you have a problem, because the original value is gone once the add has been done. But that case is actually an in-place left shift; you could probably represent it as a PARALLEL and split it after reload to insns for lsl and rol.
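
A small Python sketch of the two claims above, again assuming 8-bit values modeled as integers 0..255 (the helper names `carry_via_compare` and `lsl_rol` are made up for illustration): comparing the wrapped sum against either addend gives the same carry, and when destination and both sources coincide the 16-bit add is just a left shift by one, which an lsl/rol pair computes without needing the old value.

```python
def carry_via_compare(a, b, against):
    """Carry out of the 8-bit add a + b, recovered by comparing the wrapped sum."""
    s = (a + b) & 0xFF
    return 1 if s < against else 0


# Comparing against either addend recovers the true carry, for all byte pairs.
for a in range(256):
    for b in range(256):
        true_carry = (a + b) >> 8
        assert carry_via_compare(a, b, a) == true_carry
        assert carry_via_compare(a, b, b) == true_carry


def lsl_rol(lo, hi):
    """16-bit left shift by one: lsl on the low byte, rol on the high byte."""
    carry = lo >> 7                  # bit shifted out of the low byte
    lo = (lo << 1) & 0xFF            # lsl
    hi = ((hi << 1) | carry) & 0xFF  # rol: shift in the carry from the low byte
    return lo, hi


# x + x equals the lsl/rol result for every 16-bit x.
for x in range(0x10000):
    lo, hi = lsl_rol(x & 0xFF, x >> 8)
    assert (hi << 8) | lo == (x + x) & 0xFFFF
```

If the destination overwrites both sources, the compare would test the sum against itself and always report no carry, which is why that case needs the separate shift representation.
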


You could also represent the 16-bit and 32-bit operations as PARALLELs and split them after reload, but I think it's worse. Using PARALLELs has the advantage that you have access to values of the registers before the instruction; however, it prevents fwprop from doing your simplifications (because fwprop would not find a matching insn for "set the low byte to ~0 and OR the other three bytes").

How does the LTU -> LT transition work?

Typo.


And reloading from memory might change Carry. How is that handled? I read
somewhere, in some PR discussion, about that problem and that it blocked the
cc0 -> CCmode transition (I don't know if that's actually a restriction).

You represent everything without CC registers until after reload, so that reload can insert move instructions freely. See the head comment in compare-elim.c.


(The split after reload shown earlier is cp+adc.) compare-elim will then be able to turn this into add+adc:

   (parallel
     [(set (reg:QI   L1) (plus:QI    (reg:QI L2) (reg:QI L3)))
      (set (reg:CC_C CC) (compare:QI
                         (plus:QI    (reg:QI L2) (reg:QI L3))
                         (reg:QI L3)))])
   (set (reg:QI H1) (plus:QI
                    (plus:QI (reg:QI   H2) (reg:QI H3))
                    (lt:QI   (reg:CC_C CC) (const_int 0))))

What about the other flags like Z (zero) and N (negative)? They are not as important as Carry but are still usable to avoid comparisons. Describing all that explicitly in RTL will be quite tedious...

If add sets other flags correctly, it would just use a different mode. I wasn't sure, so I was conservative in my example. You may want to look at the mn10300 port, which in fact only has three modes: CCmode requires the arithmetic instruction to set all of N/Z/V/C, while CCZNCmode and CCZNmode are for subsets.
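
To make the "different mode per flag subset" idea concrete, here is a rough Python sketch. It is an illustration only, not GCC code: `add8_flags`, `MODE_NEEDS`, and `usable_modes` are hypothetical names, and the flag sets assigned to CCZNCmode and CCZNmode are an assumption read off the mode names (Z/N/C and Z/N respectively). The sketch computes the four flags of an 8-bit add and lists which modes an instruction may claim given the flags it sets correctly.

```python
def add8_flags(a, b):
    """Return (result, flags) for an 8-bit add; flags is a dict of N/Z/V/C bits."""
    r = (a + b) & 0xFF
    flags = {
        "C": (a + b) >> 8,                    # unsigned carry out
        "Z": int(r == 0),                     # result is zero
        "N": r >> 7,                          # sign bit of the result
        "V": (((a ^ r) & (b ^ r)) >> 7) & 1,  # signed overflow
    }
    return r, flags


# Assumed flag requirements per mode (hypothetical table for illustration).
MODE_NEEDS = {
    "CCmode": {"N", "Z", "V", "C"},
    "CCZNCmode": {"Z", "N", "C"},
    "CCZNmode": {"Z", "N"},
}


def usable_modes(valid_flags):
    """Modes whose required flags are all set correctly by the instruction."""
    return [m for m, need in MODE_NEEDS.items() if need <= set(valid_flags)]


# An instruction that sets Z, N and C but leaves V undefined cannot use CCmode.
assert usable_modes({"Z", "N", "C"}) == ["CCZNCmode", "CCZNmode"]
```

A consumer of the flags then asks for the weakest mode that covers the condition it tests, so more producers qualify.
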


Paolo

