This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PowerPC register and memory cost update


> +
> + /* A C expression returning the cost of moving data from a register of class
> +    CLASS1 to one of CLASS2.  */
> +
> + int
> + rs6000_register_move_cost (mode, from, to)
> +      enum machine_mode mode;
> +      enum reg_class from, to;
> + {
> +   /*  Moves from/to GENERAL_REGS.  */
> +   if (reg_classes_intersect_p (to, GENERAL_REGS)
> +       || reg_classes_intersect_p (from, GENERAL_REGS))
> +     {
> +       if (! reg_classes_intersect_p (to, GENERAL_REGS))
> +         from = to;
> +
> +       if (from == FLOAT_REGS || from == ALTIVEC_REGS)
> +         return (rs6000_memory_move_cost (mode, from, 0)
> +                 + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
> +
> +       /* It's more expensive to move CR_REGS than CR0_REGS because of the shift...*/
> +       else if (from == CR_REGS)
> +         return 4;
> +
> +       else
> +         /* A move will cost one instruction per GPR moved.  */
> +         return 2 * HARD_REGNO_NREGS (0, mode);
> +     }
> +
> +   /* Moving between two similar registers is just one instruction.  */
> +   else if (reg_classes_intersect_p (to, from))
> +     return mode == TFmode ? 4 : 2;
> +
> +   /* Everything else has to go through GENERAL_REGS.  */
> +   else
> +     return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
> +             + rs6000_register_move_cost (mode, from, GENERAL_REGS));
> + }

This doesn't handle special registers (lr, ctr) specially -- moves between
general and special registers get the same cost as general<->general moves,
and special<->special moves get twice that cost.  The old version didn't
make special<->general more expensive than general<->general either, but at
least it made special<->special really expensive.
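
To make the comparison concrete, here is a rough standalone model of both
cost functions.  It is only a sketch: the register classes are reduced to
made-up bit masks, the mode is assumed to fit in one GPR (so the TFmode
case is dropped), and MEMORY_MOVE_COST is a placeholder for
rs6000_memory_move_cost, which isn't shown in the quoted hunk.

```c
#include <assert.h>

/* Simplified register classes as bit masks.  The names follow the patch;
   the mask values are invented for this sketch.  */
enum reg_class
{
  GENERAL_REGS = 1 << 0,
  FLOAT_REGS   = 1 << 1,
  ALTIVEC_REGS = 1 << 2,
  LINK_REGS    = 1 << 3,
  CTR_REGS     = 1 << 4,
  CR_REGS      = 1 << 5
};

/* Assumption: the value fits in one hard register (e.g. SImode, 32-bit).  */
#define NREGS 1
/* Assumption: placeholder value standing in for rs6000_memory_move_cost.  */
#define MEMORY_MOVE_COST 4

int
classes_intersect (enum reg_class a, enum reg_class b)
{
  return (a & b) != 0;
}

/* Mirrors the control flow of the quoted rs6000_register_move_cost.  */
int
new_cost (enum reg_class from, enum reg_class to)
{
  if (classes_intersect (to, GENERAL_REGS)
      || classes_intersect (from, GENERAL_REGS))
    {
      if (!classes_intersect (to, GENERAL_REGS))
        from = to;
      if (from == FLOAT_REGS || from == ALTIVEC_REGS)
        return 2 * MEMORY_MOVE_COST;
      else if (from == CR_REGS)
        return 4;
      else
        return 2 * NREGS;
    }
  else if (classes_intersect (to, from))
    return 2;
  else
    return new_cost (GENERAL_REGS, to) + new_cost (from, GENERAL_REGS);
}

int
is_special (enum reg_class c)
{
  return c == LINK_REGS || c == CTR_REGS;
}

/* Mirrors the old REGISTER_MOVE_COST macro, with the special classes
   reduced to LINK_REGS and CTR_REGS.  */
int
old_cost (enum reg_class c1, enum reg_class c2)
{
  return (c1 == FLOAT_REGS && c2 == FLOAT_REGS ? 2
          : c1 == FLOAT_REGS && c2 != FLOAT_REGS ? 10
          : c1 != FLOAT_REGS && c2 == FLOAT_REGS ? 10
          : c1 == ALTIVEC_REGS && c2 != ALTIVEC_REGS ? 20
          : c1 != ALTIVEC_REGS && c2 == ALTIVEC_REGS ? 20
          : is_special (c1) && is_special (c2) ? 10
          : 2);
}

/* The claims made above, checked against the model.  */
void
check_claims (void)
{
  assert (new_cost (GENERAL_REGS, LINK_REGS)
          == new_cost (GENERAL_REGS, GENERAL_REGS));
  assert (new_cost (LINK_REGS, CTR_REGS)
          == 2 * new_cost (GENERAL_REGS, GENERAL_REGS));
  assert (old_cost (GENERAL_REGS, LINK_REGS) == 2);
  assert (old_cost (LINK_REGS, CTR_REGS) == 10);
}
```

Under these assumptions lr<->ctr drops from 10 to 4, while gpr<->lr stays
at 2 in both versions.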

I can imagine this hurts a lot on register-starved routines, but I have no
benchmarks to back this up.  So please test or ignore :)

On a related note, looking at scheduling dumps (from -da), it seems to me
that GCC thinks loads have a latency of 2 cycles (correct) and a throughput
of 1 per 2 cycles (incorrect for most CPUs, and certainly for the 7400 I had
it optimize for: it can issue one load per cycle).  This hurts my indirect
threaded code interpreter a lot.
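
As a back-of-the-envelope illustration of why the throughput figure
matters, here is an idealized model of a run of independent loads.  It is
my own simplification, not how GCC's scheduler describes the pipeline; the
function name and parameters are invented for this sketch.

```c
/* Cycle in which the last of N independent loads delivers its result, for
   a unit that can start one load every ISSUE_INTERVAL cycles and needs
   LATENCY cycles to complete each one.  Idealized: no other stalls.  */
int
cycles_for_loads (int n, int latency, int issue_interval)
{
  if (n <= 0)
    return 0;
  return (n - 1) * issue_interval + latency;
}
```

With a latency of 2, ten such loads take 20 cycles at one load per two
cycles, but only 11 at one load per cycle, so a scheduler that assumes the
former will space loads out far more than the hardware needs.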

Could you point me at the "guilty" part of GCC?  I just don't seem to be
able to find where the issue rates are described.


Segher


> !    On the RS/6000, copying between floating-point and fixed-point
> !    registers is expensive.  */
> !
> ! #define REGISTER_MOVE_COST(MODE, CLASS1, CLASS2)              \
> !    ((CLASS1) == FLOAT_REGS && (CLASS2) == FLOAT_REGS ? 2      \
> !    : (CLASS1) == FLOAT_REGS && (CLASS2) != FLOAT_REGS ? 10    \
> !    : (CLASS1) != FLOAT_REGS && (CLASS2) == FLOAT_REGS ? 10    \
> !    : (CLASS1) == ALTIVEC_REGS && (CLASS2) != ALTIVEC_REGS ? 20        \
> !    : (CLASS1) != ALTIVEC_REGS && (CLASS2) == ALTIVEC_REGS ? 20        \
> !    : (((CLASS1) == SPECIAL_REGS || (CLASS1) == MQ_REGS                \
> !        || (CLASS1) == LINK_REGS || (CLASS1) == CTR_REGS               \
> !        || (CLASS1) == LINK_OR_CTR_REGS)                               \
> !       && ((CLASS2) == SPECIAL_REGS || (CLASS2) == MQ_REGS     \
> !         || (CLASS2) == LINK_REGS || (CLASS2) == CTR_REGS      \
> !         || (CLASS2) == LINK_OR_CTR_REGS)) ? 10                \
> !    : 2)


