This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.


- *From*: "Dhananjay R. Deshpande" <dhananjayd at kpit dot com>
- *To*: "Kazu Hirata" <kazu at cs dot umass dot edu>
- *Cc*: <gcc-patches at gcc dot gnu dot org>
- *Date*: Wed, 31 Jul 2002 15:17:02 +0530
- *Subject*: [PATCH] H8300 Shift Optimization

Hi Kazu,

Based on the analysis given below, some of the HImode shifts currently
implemented as loops could be optimized for speed.

For H8300
=========

1. SHIFT_ASHIFT by 5 - This could be inlined. The current implementation
   takes 42 clock cycles to execute and requires 4 instructions. If inlined,
   it takes 10 cycles and 5 instructions.

2. SHIFT_ASHIFT by 6 - If inlined, this would execute in 12 cycles as
   compared to 50, at the cost of two more instructions.

3. SHIFT_ASHIFT by 13 - Moving the low byte to the high byte and inlining the
   remaining 5 shifts takes 7 instructions and 14 clock cycles, while the loop
   takes 4 instructions but 106 clock cycles.

4. SHIFT_ASHIFT by 14 - The loop takes 4 instructions and 114 clock cycles.
   Moving the low byte to the high byte, rotating right twice, and ANDing
   with 0xC0 instead takes 5 instructions and executes in 10 clock cycles.

5. SHIFT_LSHIFTRT by 13 - The loop takes 5 instructions and 132 clock cycles,
   while moving the high byte to the low byte and inlining the rest takes
   7 instructions and 14 clock cycles.

6. SHIFT_LSHIFTRT by 14 - The loop takes 5 instructions and 142 clock cycles,
   while moving the byte, rotating left twice, and ANDing with 0x03 takes
   5 instructions and 10 clock cycles.

7. SHIFT_ASHIFTRT by 13 - The loop takes 5 instructions and 132 clock cycles,
   while moving the high byte to the low byte and inlining the rest takes
   7 instructions and 14 clock cycles.

8. SHIFT_ASHIFTRT by 14 - The loop takes 5 instructions and 142 clock cycles,
   while the sequence

        mov.b   r0h,r0l
        shll.b  r0l
        subx.b  r0h,r0h
        shll.b  r0l
        mov.b   r0h,r0l
        bst.b   #0,r0l

   takes 6 instructions and 12 clock cycles.

For H8300H
==========

1. SHIFT_ASHIFT by 5 - This could be inlined. The current implementation
   takes 42 clock cycles to execute and requires 4 instructions. If inlined,
   it takes 10 cycles and 5 instructions.

2. SHIFT_ASHIFT by 6 - If inlined, this would execute in 12 cycles as
   compared to 50, at the cost of two more instructions.

3. SHIFT_LSHIFTRT by 5 - This could be inlined. The current implementation
   takes 42 clock cycles to execute and requires 4 instructions. If inlined,
   it takes 10 cycles and 5 instructions.

4. SHIFT_LSHIFTRT by 6 - If inlined, this would execute in 12 cycles as
   compared to 50, at the cost of two more instructions.

5. SHIFT_ASHIFTRT by 5 - This could be inlined. The current implementation
   takes 42 clock cycles to execute and requires 4 instructions. If inlined,
   it takes 10 cycles and 5 instructions.

6. SHIFT_ASHIFTRT by 6 - If inlined, this would execute in 12 cycles as
   compared to 50, at the cost of two more instructions.

7. SHIFT_ASHIFTRT by 13 - The loop takes 4 instructions and 106 clock cycles,
   while moving the high byte to the low byte and inlining the rest takes
   7 instructions and 14 clock cycles.

8. SHIFT_ASHIFTRT by 14 - The loop takes 4 instructions and 114 clock cycles,
   while the sequence

        shll.b  r0h
        subx.b  r0l,r0l
        shll.b  r0h
        rotxl.b r0l
        exts.w  r0

   takes 5 instructions and 10 clock cycles.

For H8S
=======

1. SHIFT_ASHIFTRT by 13 - The loop takes 5 instructions and 26 clock cycles,
   while moving the high byte to the low byte and inlining the rest takes
   5 instructions and 5 clock cycles.

2. SHIFT_ASHIFTRT by 14 - The loop takes 4 instructions and 29 clock cycles,
   while the sequence

        mov.b   r0h,r0l
        exts.w  r0
        shar.w  #2,r0
        shar.w  #2,r0
        shar.w  #2,r0

   takes 5 instructions and 5 clock cycles.

So at the cost of 1 or 2 extra instructions, most of the loops can be avoided
and a good speed increase achieved. The following patch implements the above
cases.

========================================================================
*** h8300.c.orig	Tue Jul 23 12:17:18 2002
--- h8300.c	Wed Jul 24 10:35:46 2002
***************
*** 2047,2056 ****
      /*  0    1    2    3    4    5    6    7  */
      /*  8    9   10   11   12   13   14   15  */
      { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, LOP, LOP, SPC }, /* SHIFT_ASHIFT   */
      { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, LOP, LOP, SPC }, /* SHIFT_LSHIFTRT */
!     { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, LOP, LOP, SPC }, /* SHIFT_ASHIFTRT */
    },
    {
--- 2047,2056 ----
      /*  0    1    2    3    4    5    6    7  */
      /*  8    9   10   11   12   13   14   15  */
+     { INL, INL, INL, INL, INL, INL, INL, SPC,
+       SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFT   */
      { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_LSHIFTRT */
      { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
    },
    {
***************
*** 2058,2067 ****
      /*  0    1    2    3    4    5    6    7  */
      /*  8    9   10   11   12   13   14   15  */
!     { INL, INL, INL, INL, INL, LOP, LOP, SPC,
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_ASHIFT   */
!     { INL, INL, INL, INL, INL, LOP, LOP, SPC,
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
!     { INL, INL, INL, INL, INL, LOP, LOP, SPC,
!       SPC, SPC, SPC, SPC, SPC, LOP, LOP, SPC }, /* SHIFT_ASHIFTRT */
    },
    {
--- 2058,2067 ----
      /*  0    1    2    3    4    5    6    7  */
      /*  8    9   10   11   12   13   14   15  */
!     { INL, INL, INL, INL, INL, INL, INL, SPC,
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_ASHIFT   */
!     { INL, INL, INL, INL, INL, INL, INL, SPC,
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
!     { INL, INL, INL, INL, INL, INL, INL, SPC,
!       SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
    },
    {
***************
*** 2074,2078 ****
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
      { INL, INL, INL, INL, INL, INL, INL, INL,
!       SPC, SPC, SPC, SPC, SPC, LOP, LOP, SPC }, /* SHIFT_ASHIFTRT */
    }
  };
--- 2074,2078 ----
        SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
      { INL, INL, INL, INL, INL, INL, INL, INL,
!       SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
    }
  };
***************
*** 2292,2296 ****
  	}
      }
!   else if (8 <= count && count <= 12)
      {
        info->remainder = count - 8;
--- 2292,2296 ----
  	}
      }
!   else if (8 <= count && count <= 13)
      {
        info->remainder = count - 8;
***************
*** 2315,2318 ****
--- 2315,2340 ----
  	  info->shift1 = "shar.b\t%s0";
  	  info->shift2 = "shar.b\t#2,%s0";
+ 	  goto end;
+ 	}
+     }
+   else if (count == 14)
+     {
+       switch (shift_type)
+ 	{
+ 	case SHIFT_ASHIFT:
+ 	  if (TARGET_H8300)
+ 	    info->special = "mov.b\t%s0,%t0\n\trotr.b\t%t0\n\trotr.b\t%t0\n\tand.b\t#0xC0,%t0\n\tsub.b\t%s0,%s0";
+ 	  goto end;
+ 	case SHIFT_LSHIFTRT:
+ 	  if (TARGET_H8300)
+ 	    info->special = "mov.b\t%t0,%s0\n\trotl.b\t%s0\n\trotl.b\t%s0\n\tand.b\t#3,%s0\n\tsub.b\t%t0,%t0";
+ 	  goto end;
+ 	case SHIFT_ASHIFTRT:
+ 	  if (TARGET_H8300)
+ 	    info->special = "mov.b\t%t0,%s0\n\tshll.b\t%s0\n\tsubx.b\t%t0,%t0\n\tshll.b\t%s0\n\tmov.b\t%t0,%s0\n\tbst.b\t#0,%s0";
+ 	  else if (TARGET_H8300H)
+ 	    info->special = "shll.b\t%t0\n\tsubx.b\t%s0,%s0\n\tshll.b\t%t0\n\trotxl.b\t%s0\n\texts.w\t%T0";
+ 	  else /* TARGET_H8300S */
+ 	    info->special = "mov.b\t%t0,%s0\n\texts.w\t%T0\n\tshar.w\t#2,%T0\n\tshar.w\t#2,%T0\n\tshar.w\t#2,%T0";
  	  goto end;
  	}
========================================================================

Regards,
Dhananjay

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Free download of GNUSH and GNUH8 tool-chains for Hitachi's SH and H8 Series.
The following site also offers free support to European customers.
Read more at http://www.kpit.com.
Latest versions of GNUSH and GNUH8 were released on July 1, 2002.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
