This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Rename/export combine_insn_cost as insn_rtx_cost


On Thu, 15 Jul 2004, Zack Weinberg wrote:
> Some targets (ARM, ia64) have the ability to conditionalize an entire
> call sequence.  I'm not sure about ARM, but on ia64 being able to
> rewrite
>
>         (p6) br.cond.dptk .L1
>         ;;
>         .mib
>         nop.m 0
>         nop.i 0
>         br.call.sptk.many b0 = foo#
>         ;;
>         .mii
>         mov r1 = r35
>         nop.i 0
>         nop.i 0
> .L1:
>
> into
>
>         (p6) br.call.dpnt.many b0 = foo#
>         ;;
>         (p6) mov r1 = r35
>
> is definitely a win.  Now I don't think we've ever been able to do
> this, but I would hate to rule out the possibility always and
> forevermore.

Certainly.  There was a discussion about this in one of the posts prior
to my patch: http://gcc.gnu.org/ml/gcc-patches/2004-07/msg00597.html

To summarise, there are two classes of optimization that occur in the
if-conversion pass, and each requires a different type of cost estimate.

The first is to conditionalize a basic block:

	if (! cond)	=>	if (cond) call func
	  goto L1
	call func
L1:

for which the relevant measure is the number of conditionally executed
instructions or, as Richard Earnshaw pointed out for newer ARM chips,
the unexecuted instruction cost.  At the moment we don't have an API
to support unexecuted_insn_cost, so we assume the cost of not executing
a conditional instruction is one.  [If someone were ever to write an
arm_unexecuted_rtx_cost (rtx pat) in arm.c, where "pat" was the equivalent
unconditional pattern, I'd be happy to turn it into a target hook that
would get called from the appropriate places in ifcvt.c.]
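
As a rough illustration of this first measure, here is a minimal sketch.
It is a toy model, not code from ifcvt.c: the toy_* names and the
comparison against a caller-supplied branch cost are assumptions made
for the example.  The only things taken from the discussion above are
that the measure counts conditionally executed instructions and that
not executing one is assumed to cost one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not GCC code: the cost charged for a
   predicated instruction whose condition turns out to be false.
   The discussion above assumes this is one.  */
#define TOY_UNEXECUTED_INSN_COST 1

/* Cost paid on the path that skips the block once its N_INSNS
   instructions have been conditionalized: each one is still issued
   but not executed.  */
int
toy_cond_exec_skip_cost (size_t n_insns)
{
  return (int) n_insns * TOY_UNEXECUTED_INSN_COST;
}

/* Return nonzero if conditionalizing the block looks worthwhile, i.e.
   the skipped-path cost does not exceed the cost of the branch that is
   removed.  BRANCH_COST is an assumed, caller-supplied number.  */
int
toy_cond_exec_profitable (size_t n_insns, int branch_cost)
{
  assert (branch_cost >= 0);
  return toy_cond_exec_skip_cost (n_insns) <= branch_cost;
}
```

Note that nothing in this model looks at what the instructions are,
only at how many of them there are, so a conditionalized call is
charged the same as any other instruction.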



The second class of transformations is to speculatively execute a
set of instructions.

	if (! cond)	=>	x = func()
	  goto L1		cmov cond, x, y
	x = func()
	goto L2
L1:	x = y
L2:


In these transformations, we execute instructions on a path where they
weren't executed before, and it is therefore the timing costs of these
instructions that are relevant.
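
A minimal sketch of this second kind of estimate follows.  Again it is
a toy model under stated assumptions rather than GCC's implementation:
the toy_* names, the use of a negative cost to mean "no timing estimate
available", and the caller-supplied max_cost budget are all invented
for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not GCC code.  Each entry of INSN_COSTS is
   the timing cost of one instruction; a negative entry means that no
   timing estimate is available (for instance, a function call).  */

/* Sum the timing costs of the instructions that would be executed
   speculatively.  Returns -1 if any instruction has no estimate.  */
int
toy_speculation_cost (const int *insn_costs, size_t n_insns)
{
  int total = 0;

  assert (insn_costs != NULL || n_insns == 0);
  for (size_t i = 0; i < n_insns; i++)
    {
      if (insn_costs[i] < 0)
        return -1;
      total += insn_costs[i];
    }
  return total;
}

/* Return nonzero if speculating the instructions looks worthwhile:
   every instruction has an estimate and the total stays within
   MAX_COST, an assumed budget chosen by the caller.  */
int
toy_speculation_profitable (const int *insn_costs, size_t n_insns,
                            int max_cost)
{
  int total = toy_speculation_cost (insn_costs, n_insns);
  return total >= 0 && total <= max_cost;
}
```

Unlike an instruction count, this estimate depends on what each
instruction is, and a single instruction without a timing estimate is
enough to reject the transformation in this model.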


My changes to insn_rtx_cost concern this second class of transformation,
whilst the example you cite above concerns the first class, which uses
the instruction count measure for conditional execution.  Indeed, GCC
should currently be able to perform your transformation provided that
cond_exec_process_if_block is able to synthesize a recognizable
conditional call instruction.  Notice that your transformation is
beneficial irrespective of the actual function being called, hence it
should be independent of the CALL's rtx_cost.


But I also agree "never say never".  On some architectures there may be
less overhead in making a perfectly predictable unconditional subroutine
call than in mispredicting a conditional jump, provided that the function
itself is short enough.  The reason I wrote "yet" in my posting is that
GCC's infrastructure is currently unable to provide a reasonable timing
estimate for a function, in the same way as it can provide a size estimate
for tree-inlining.  However, one day we may reach the point of being able
to unconditionalize (speculatively execute) function calls...


Roger
--

