PowerPC register and memory cost update
Geoff Keating
geoffk@geoffk.org
Thu Oct 24 15:42:00 GMT 2002
Segher Boessenkool <segher@koffie.nl> writes:
> > +
> > + /* A C expression returning the cost of moving data from a register of class
> > + CLASS1 to one of CLASS2. */
> > +
> > + int
> > + rs6000_register_move_cost (mode, from, to)
> > + enum machine_mode mode;
> > + enum reg_class from, to;
> > + {
> > + /* Moves from/to GENERAL_REGS. */
> > + if (reg_classes_intersect_p (to, GENERAL_REGS)
> > + || reg_classes_intersect_p (from, GENERAL_REGS))
> > + {
> > + if (! reg_classes_intersect_p (to, GENERAL_REGS))
> > + from = to;
> > +
> > + if (from == FLOAT_REGS || from == ALTIVEC_REGS)
> > + return (rs6000_memory_move_cost (mode, from, 0)
> > + + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
> > +
> > + /* It's more expensive to move CR_REGS than CR0_REGS because of the shift...*/
> > + else if (from == CR_REGS)
> > + return 4;
> > +
> > + else
> > + /* A move will cost one instruction per GPR moved. */
> > + return 2 * HARD_REGNO_NREGS (0, mode);
> > + }
> > +
> > + /* Moving between two similar registers is just one instruction. */
> > + else if (reg_classes_intersect_p (to, from))
> > + return mode == TFmode ? 4 : 2;
> > +
> > + /* Everything else has to go through GENERAL_REGS. */
> > + else
> > + return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
> > + + rs6000_register_move_cost (mode, from, GENERAL_REGS));
> > + }
>
> This doesn't handle special registers (lr, ctr) specially -- moves between
> general and special registers get the same cost as general<->general moves,
> and special<->special moves get twice that cost. The old version didn't
> make special<->general more expensive than general<->general either, but at
> least it made special<->special really expensive.

In this context, a 'special<->special' move is a no-op, which is
really very cheap :-). All the special registers have register
classes of their own, and the calling code handles superclasses.

I don't know whether special<->general is more expensive than
general<->general. Since LR and CTR usually have shadow registers,
I'd expect they really are about the same.

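
To make the class-by-class behaviour of the quoted function easier to follow, here is a minimal sketch of it as a toy model. It assumes word-sized modes only (one GPR per value, so the TFmode case is left out), represents each register class as a single bit, and uses a flat memory cost of 4. The `toy_*` names, the class layout, and the memory cost are all assumptions made for illustration; they are not taken from rs6000.h or rs6000.c, and the model ignores superclasses.

```c
/* Toy model of the quoted rs6000_register_move_cost, word-sized modes only.
   Each class is one bit so that "classes intersect" is a bitwise AND.
   The class layout and the memory cost below are assumptions for this
   sketch, not values from the real port.  */

enum toy_class
{
  TOY_GENERAL = 1 << 0,
  TOY_FLOAT   = 1 << 1,
  TOY_ALTIVEC = 1 << 2,
  TOY_CR      = 1 << 3,
  TOY_LINK    = 1 << 4,   /* lr, in a class of its own */
  TOY_CTR     = 1 << 5    /* ctr, in a class of its own */
};

static int
toy_intersect_p (int a, int b)
{
  return (a & b) != 0;
}

/* Assumed stand-in for rs6000_memory_move_cost: a flat cost of 4.  */
static int
toy_memory_move_cost (int rclass)
{
  (void) rclass;
  return 4;
}

static int
toy_register_move_cost (int from, int to)
{
  /* Moves from/to the general registers.  */
  if (toy_intersect_p (to, TOY_GENERAL)
      || toy_intersect_p (from, TOY_GENERAL))
    {
      /* Make FROM name the non-general side of the move.  */
      if (! toy_intersect_p (to, TOY_GENERAL))
        from = to;

      if (from == TOY_FLOAT || from == TOY_ALTIVEC)
        return (toy_memory_move_cost (from)
                + toy_memory_move_cost (TOY_GENERAL));
      else if (from == TOY_CR)
        return 4;
      else
        return 2;   /* 2 * number of GPRs, with one GPR assumed */
    }

  /* Both ends in the same class.  */
  else if (toy_intersect_p (to, from))
    return 2;

  /* Everything else goes through the general registers.  */
  else
    return (toy_register_move_cost (TOY_GENERAL, to)
            + toy_register_move_cost (from, TOY_GENERAL));
}
```

Under these assumptions, `toy_register_move_cost (TOY_LINK, TOY_GENERAL)` and `toy_register_move_cost (TOY_GENERAL, TOY_GENERAL)` both come out as 2, while `toy_register_move_cost (TOY_LINK, TOY_CTR)` is 4 because it is costed as two moves through a GPR. A move within one special class, such as `TOY_LINK` to `TOY_LINK`, takes the same-class branch and costs 2.
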
> I can imagine this hurts a lot on register-starved routines, but I have no
> benchmarks to back this up. So please test or ignore :)
>
> On a related note, looking at scheduling dumps (from -da), it seems to me
> that GCC thinks loads have latency of 2 cycles (correct) and throughput of
> 1 per 2 cycles (incorrect for most CPUs, and certainly for the 7400 I had
> it optimize for: it can issue one load per cycle). This hurts my indirect
> threaded code interpreter a lot.
>
> Could you point me at the "guilty" part of GCC? I just don't seem to be
> able to find where the issue rates are described.

Look at the 'define_function_unit' declarations at the top of rs6000.md.
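
The difference between load latency and load throughput raised in the question can be put into numbers with a small sketch. It assumes a deliberately simple model: independent loads issued back to back on a single unit, the first at cycle 0, with nothing else competing for that unit. The function name `toy_loads_ready_cycle` is hypothetical, and this is not how GCC's scheduler computes anything; it only shows why the issue interval matters for load-heavy code.

```c
/* Cycle on which the result of the last of N_LOADS independent,
   back-to-back loads is ready.

   LATENCY is the number of cycles from issue to result.
   ISSUE_INTERVAL is the number of cycles between the starts of
   consecutive loads on the same unit.

   Assumes the first load issues at cycle 0 and the unit is otherwise
   idle.  */
static int
toy_loads_ready_cycle (int n_loads, int latency, int issue_interval)
{
  if (n_loads <= 0)
    return 0;
  return (n_loads - 1) * issue_interval + latency;
}
```

With a latency of 2, four loads finish at cycle 8 if a load can only start every 2 cycles, and at cycle 5 if one can start every cycle, so a scheduler that believes the first figure will spread loads out more than the hardware needs.
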
--
- Geoffrey Keating <geoffk@geoffk.org>