PowerPC register and memory cost update

Geoff Keating geoffk@geoffk.org
Thu Oct 24 15:42:00 GMT 2002


Segher Boessenkool <segher@koffie.nl> writes:

> > +
> > + /* A C expression returning the cost of moving data from a register of class
> > +    CLASS1 to one of CLASS2.  */
> > +
> > + int
> > + rs6000_register_move_cost (mode, from, to)
> > +      enum machine_mode mode;
> > +      enum reg_class from, to;
> > + {
> > +   /*  Moves from/to GENERAL_REGS.  */
> > +   if (reg_classes_intersect_p (to, GENERAL_REGS)
> > +       || reg_classes_intersect_p (from, GENERAL_REGS))
> > +     {
> > +       if (! reg_classes_intersect_p (to, GENERAL_REGS))
> > +         from = to;
> > +
> > +       if (from == FLOAT_REGS || from == ALTIVEC_REGS)
> > +         return (rs6000_memory_move_cost (mode, from, 0)
> > +                 + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
> > +
> > +       /* It's more expensive to move CR_REGS than CR0_REGS because of the shift...*/
> > +       else if (from == CR_REGS)
> > +         return 4;
> > +
> > +       else
> > +         /* A move will cost one instruction per GPR moved.  */
> > +         return 2 * HARD_REGNO_NREGS (0, mode);
> > +     }
> > +
> > +   /* Moving between two similar registers is just one instruction.  */
> > +   else if (reg_classes_intersect_p (to, from))
> > +     return mode == TFmode ? 4 : 2;
> > +
> > +   /* Everything else has to go through GENERAL_REGS.  */
> > +   else
> > +     return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
> > +             + rs6000_register_move_cost (mode, from, GENERAL_REGS));
> > + }
> 
> This doesn't handle special registers (lr, ctr) specially -- moves between
> general and special registers get the same cost as general<->general moves,
> and special<->special moves get twice that cost.  The old version didn't
> make special<->general more expensive than general<->general either, but at
> least it made special<->special really expensive.

In this context, a 'special<->special' move is a no-op, which is
really very cheap :-).  Each special register has a register class
of its own, so a move within one of those classes goes from a
register to itself; the calling code handles superclasses.

I don't know whether special<->general is more expensive than
general<->general.  Since LR and CTR usually have shadow registers,
I'd expect they really are about the same.
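
To make the numbers under discussion concrete, here is a toy C model
of the decision structure in the quoted patch.  It is only a sketch:
the toy_* names, the bit-mask representation of register classes and
the flat memory cost of 4 are all assumptions made for illustration,
and the TFmode case is left out.  It is not the rs6000 code.

```c
#include <assert.h>

/* Toy register classes as bit masks; these names are illustrative
   stand-ins, not the real rs6000 enum reg_class values.  */
enum toy_class
{
  TOY_GENERAL = 1 << 0,
  TOY_FLOAT   = 1 << 1,
  TOY_LINK    = 1 << 2,
  TOY_CTR     = 1 << 3,
  TOY_CR      = 1 << 4
};

/* Stand-in for reg_classes_intersect_p: do the classes share a register?  */
static int
toy_intersect_p (unsigned a, unsigned b)
{
  return (a & b) != 0;
}

/* Assumed flat memory move cost; the real rs6000_memory_move_cost
   depends on mode and class.  */
static int
toy_memory_move_cost (unsigned rclass)
{
  (void) rclass;
  return 4;
}

/* Same decision structure as the quoted patch.  NREGS plays the role
   of HARD_REGNO_NREGS (0, mode).  */
static int
toy_register_move_cost (unsigned from, unsigned to, int nregs)
{
  if (toy_intersect_p (to, TOY_GENERAL) || toy_intersect_p (from, TOY_GENERAL))
    {
      /* Make FROM name the side that is not a general register.  */
      if (!toy_intersect_p (to, TOY_GENERAL))
        from = to;

      if (from == TOY_FLOAT)
        return (toy_memory_move_cost (from)
                + toy_memory_move_cost (TOY_GENERAL));
      else if (from == TOY_CR)
        return 4;
      else
        return 2 * nregs;
    }
  else if (toy_intersect_p (to, from))
    return 2;
  else
    /* No direct path: sum the two hops through a general register.  */
    return (toy_register_move_cost (TOY_GENERAL, to, nregs)
            + toy_register_move_cost (from, TOY_GENERAL, nregs));
}

/* Spot checks for single-register modes (nregs == 1).  */
static void
toy_check_costs (void)
{
  assert (toy_register_move_cost (TOY_GENERAL, TOY_GENERAL, 1) == 2);
  assert (toy_register_move_cost (TOY_LINK, TOY_GENERAL, 1) == 2);
  assert (toy_register_move_cost (TOY_GENERAL, TOY_CTR, 1) == 2);
  assert (toy_register_move_cost (TOY_LINK, TOY_CTR, 1) == 4);
}
```

In this model a general<->special move costs the same as a
general<->general move, and a move between two different special
classes costs the sum of two such moves, because it falls through to
the last branch.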

> I can imagine this hurts a lot on register-starved routines, but I have no
> benchmarks to back this up.  So please test or ignore :)
> 
> On a related note, looking at scheduling dumps (from -da), it seems to me
> that GCC thinks loads have a latency of 2 cycles (correct) and throughput of
> 1 per 2 cycles (incorrect for most CPUs, and certainly for the 7400 I had
> it optimize for: it can issue one load per cycle).  This hurts my indirect
> threaded code interpreter a lot.
> 
> Could you point me at the "guilty" part of GCC?  I just don't seem to be
> able to find where the issue rates are described.

Look at the top of rs6000.md, the 'define_function_unit' declarations.
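
As a rough illustration of why the distinction between latency and
issue rate matters, the following C sketch models a single pipelined
unit with two parameters.  The names ready_delay and issue_delay are
borrowed from the operands of define_function_unit as I understand
them; the struct, the function and the numbers are hypothetical and
are not taken from rs6000.md.

```c
/* Toy model of one pipelined function unit.  ready_delay is the
   result latency in cycles; issue_delay is the number of cycles
   before the unit can accept another instruction.  */
struct toy_unit
{
  int ready_delay;
  int issue_delay;
};

/* Cycles until the last of N independent, back-to-back loads has
   produced its result on UNIT.  */
static int
toy_cycles_for_loads (struct toy_unit unit, int n)
{
  if (n <= 0)
    return 0;
  return (n - 1) * unit.issue_delay + unit.ready_delay;
}
```

With a ready_delay of 2, four independent loads take 8 cycles in this
model if issue_delay is 2, but only 5 cycles if issue_delay is 1, so
an overly pessimistic issue delay makes the scheduler spread loads
further apart than the hardware requires.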

-- 
- Geoffrey Keating <geoffk@geoffk.org>


