This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] PR target/19597: Rewrite AVR backend's rtx_costs


On Wed, 26 Jan 2005, Paul Schlie wrote:
> So I wonder if a tweak to the patch like below might be prudent;

No.

I'd expect not.  Firstly, the current code size regressions from my
patch are mostly due to increasing the costs of CONST_INT from
COSTS_N_INSNS(1).  Secondly, and more seriously, CONST_INT rtx codes
don't have a mode; they always have mode VOIDmode.  In this code:

>       case CONST_INT:
> +     case CONST_DOUBLE:
> -  !    *total = TARGET_INT8 ? COSTS_N_INSNS (1) : COSTS_N_INSNS (2);
> +  !    *total = 1 + COSTS_N_INSNS (GET_MODE_SIZE (mode));
>    !    return true;

your change really doesn't do what you think it does.  The whole
reason my patch used a subroutine "avr_operand_rtx_cost" was to allow
us to pass down the "mode" from the parent expression.  You'll notice
that the cost of a CONST_INT in avr_operand_rtx_cost is very different
from the modeless assumptions made at the top-level avr_rtx_costs.
If no information is available, assuming that a CONST_INT is by default
the target's "int" seems reasonable (hence the TARGET_INT8 check).


> For mult, as I believe !AVR_ENHANCED implies add&shift loop, possibly:
>    !  case MULT:
> +       if (AVR_ENHANCED)
> +         *total = COSTS_N_INSNS ( 4 * GET_MODE_SIZE (mode) );
> +       else
> +         *total = COSTS_N_INSNS ( 16 * GET_MODE_SIZE (mode) );

The problem here is that, as prior to my patch, you're not taking
optimize_size into account.  On !AVR_ENHANCED processors, multiplications
are implemented by subroutine calls to libgcc which only take one or
two bytes depending upon AVR_MEGA.  Stating that a multiplication
takes 16*GET_MODE_SIZE(mode) unconditionally is why the middle-end
believes that it's preferable to implement an SImode multiplication
by up to 64 shift/add instructions, which is the reason for this PR!!


Removing the "return false" bits of my suggested patch would require
accurate profiles, so that we could estimate the average number of
cycles spent in __mulsi3, __mulhi3 and __mulqi3, which obviously
depends heavily upon the distribution of operands that they're passed
in real/typical code.

Roger
--
