This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] PR target/19597: Rewrite AVR backend's rtx_costs
- From: HutchinsonAndy at netscape dot net
- To: roger at eyesopen dot com
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Wed, 26 Jan 2005 19:18:25 -0500
- Subject: Re: [PATCH] PR target/19597: Rewrite AVR backend's rtx_costs
I wholeheartedly agree that the current AVR rtx_costs are not good enough. If they are not fixed, many problems are going to come up with every GCC optimisation improvement.
Your solution is very close to one I have been working on. Perhaps I could offer some suggestions based on what I have discovered. These may fix any code expansion - indeed the code should get smaller.
If GCC had a) consistent usage of rtx_cost return values and calling conventions and b) an outer mode, life and code would be so much easier!
As you know, rtx_cost is called with all sorts of valid and invalid expressions and various outer codes, so you have to deal with all of them as best you can. (I dumped the calls and return values to a file!)
1) CONST_INT with outer code of SET is (as you have stated) a real pain with no mode information. I deal with this by looking at the value and making assumptions, e.g. 10000=HImode, CONST_DOUBLE=SFmode, LABEL=HImode.
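A minimal sketch of the value-based mode guess described above. The enum and function names are hypothetical stand-ins, not GCC's real machine mode types, and the choice to accept both the signed and unsigned range of each width is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-ins for GCC machine modes; values are sizes in bytes. */
enum sketch_mode { SKETCH_QImode = 1, SKETCH_HImode = 2, SKETCH_SImode = 4 };

/* Guess the narrowest mode whose signed or unsigned range holds VALUE.
   Anything that does not fit in 16 bits is treated as SImode here. */
static enum sketch_mode
sketch_guess_const_mode (int64_t value)
{
  if (value >= -128 && value <= 255)
    return SKETCH_QImode;
  if (value >= -32768 && value <= 65535)
    return SKETCH_HImode;
  return SKETCH_SImode;
}
```

With this guess, the constant 10000 from the example lands in the HImode bucket.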
One issue you will find is that you will produce better code if the cost of QImode constants with outer code of SET is ZERO. Making only const_int 0 cost ZERO helps, but it needs to be all QImode values to get the best effect.
I have not totally worked out why, but I think it is due to calls.c not putting stuff in hard registers if cost > COSTS_N_INSNS (1). You get reg-reg copies, as the standard register allocator is not good enough to get rid of them. There is a similar comparison made in reload.
There is another aspect that supports a cost of zero. The default rtx_cost within gcc is zero for register operands (no override, and independent of mode). So, comparatively, any nonzero cost for a constant is going to swing gcc toward keeping constants hanging about in registers. That is rarely a win against other uses of the registers.
The downside of having a zero cost is that parts of branch evaluation (ifcvt?) take ZERO as "unknown cost" and avoid it. So some tweaking is needed here to get an overall balance.
2) My implementation differs in that I evaluate all relevant constant operands before recursion. Most of the time this is just determining the cost of the operation based on the value - not really a constant cost. Mode information is obviously immediately available.
So far that's the same as your code (your shift numbers are way better than mine though!)
When the code recurses (equivalent to you calling avr_rtx_operand), all constant costs become ZERO when outer code != SET. That way I easily ignore the spurious constant values dotted about in CALL, JUMP and IF_THEN_ELSE expressions. You, on the other hand, apply costs to these - I have no idea what impact that may have, but it may compromise comparison/jump optimisations.
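The two constant rules above (free when the outer code is not SET, and free for byte-sized values under SET) can be sketched as follows. The enum, the function name and the nonzero placeholder cost are assumptions for illustration; only the `(n) * 4` scaling mirrors GCC's COSTS_N_INSNS macro.

```c
/* Hypothetical subset of rtx codes, for illustration only. */
enum sketch_code { SK_SET, SK_PLUS, SK_CALL, SK_IF_THEN_ELSE };

/* Same scaling as GCC's COSTS_N_INSNS macro. */
#define SK_COSTS_N_INSNS(n) ((n) * 4)

/* Cost of a constant VALUE appearing directly under OUTER. */
static int
sketch_const_cost (enum sketch_code outer, long value)
{
  /* Constants met while recursing into CALL, IF_THEN_ELSE, etc. are free. */
  if (outer != SK_SET)
    return 0;
  /* Under SET, byte-sized (QImode) constants are free as well. */
  if (value >= -128 && value <= 255)
    return 0;
  /* Placeholder for wider constants: an assumption, not a measured figure. */
  return SK_COSTS_N_INSNS (1);
}
```

Any per-operation cost that depends on a constant's value would be charged by the caller before recursing, which is why this helper can return zero so often.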
3) Addressing is critical to several existing optimisations for bit operations on I/O. Operands of AND and OR that address this area must be recognised and their costs adjusted appropriately. If not, the combiner is highly likely to pick a less optimal combination and produce more code.
Operand costs for I/O MEM < 64 need to prefer direct addressing e.g. . You should end up with ZERO additional cost for this. Combiner pattern recognition expects it a certain way.
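A rough sketch of the I/O special case, assuming the operand is a MEM at a known constant address. The function names and the nonzero placeholder cost are hypothetical; the threshold of 64 is taken directly from the text above.

```c
#include <stdbool.h>

/* Same scaling as GCC's COSTS_N_INSNS macro. */
#define SK_COSTS_N_INSNS(n) ((n) * 4)

/* True when ADDR falls in the low I/O range named in the text (< 64). */
static bool
sketch_is_low_io_address (unsigned addr)
{
  return addr < 64;
}

/* Additional cost of a directly addressed MEM operand of AND/OR. */
static int
sketch_io_operand_cost (unsigned addr)
{
  /* Low I/O addresses add nothing, so the combiner keeps the direct form. */
  if (sketch_is_low_io_address (addr))
    return 0;
  /* Placeholder for any other direct address: an assumption. */
  return SK_COSTS_N_INSNS (2);
}
```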
4) Address costs are equally messed up and I have combined the two evaluations. This is a bit tricky, as it's not clear if address cost should (ideally) consider mode or not. Anyway - it can't, as again the outer mode is not available to the target macro - duh!
What I do is calculate address cost to represent the cost of creating the address and getting/putting one byte. rtx_cost can, however, evaluate fully as long as it's an operand (code==MEM). If outer code == MEM, I call address cost.
It's probably easier to define memory_cost=1 and then have rtx_cost handle all the MEM requests (that seems to be the way gcc handles it when memory_costs are equal).
Note that when using AVR instruction size (=words) as cost the compiler will tend to use more register indirect [r26,r27] addressing.
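To make the address cost idea concrete, here is a small sketch that uses instruction words as the unit. All names are hypothetical, and multiplying by the mode size is an assumption about how a full MEM operand might be scaled from the one-byte address cost. The word counts reflect that direct loads and stores (LDS/STS) are two-word instructions while register-indirect ones (LD/ST) are one word.

```c
/* Same scaling as GCC's COSTS_N_INSNS macro. */
#define SK_COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical address forms, for illustration only. */
enum sk_addr_kind { SK_ADDR_REG_INDIRECT, SK_ADDR_DIRECT };

/* Address cost: form the address and move one byte through it,
   measured in instruction words. */
static int
sk_address_cost (enum sk_addr_kind kind)
{
  return kind == SK_ADDR_DIRECT ? SK_COSTS_N_INSNS (2) : SK_COSTS_N_INSNS (1);
}

/* Cost of a MEM operand whose mode is MODE_BYTES wide: one access per byte.
   This scaling is an assumption, not taken from the AVR backend. */
static int
sk_mem_operand_cost (enum sk_addr_kind kind, int mode_bytes)
{
  return sk_address_cost (kind) * mode_bytes;
}
```

Because nothing here charges for setting up the pointer register, the indirect form always looks cheaper, which matches the bias toward register-indirect addressing noted above.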
When I went through gcc, it seemed that address optimisation does not specifically consider the number of occurrences of a particular address - or the setup cost of a register - thus the [Rx] form is immediately attractive. I ended up "discounting" all direct addresses to prevent this - but I would not conclude that is the best way.
I use normal recursion through the return value. I can do this because I have no constants to worry about and have MODES_TIEABLE_P set - the latter is the only way to override the default SUBREG cost.
I think MODES_TIEABLE_P is incorrect for AVR - perhaps due to its description. As far as I can tell, it is OK to set it if you can access a register of one mode in any other mode, with the additional constraint that both must be valid registers - but that's another story!