This is the mail archive of the mailing list for the GCC project.

[PATCH] ARM add branchne_decr patterns for ARM and Thumb

This patch adds an optimization to improve a common idiom in a loop 
iteration test:

	while (x--)

by converting the ARM sequence

	sub	rd, rn, #1
	cmn	rd, #1		@ (cmp rd, #-1)
	bne	<dest>

into

	subs	rd, rn, #1
	bcs	<dest>

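It relies on the fact that (ne reg 0) can be implemented as (geu reg 1) 
and that reg and 1 are the inputs to the SUB expression, so we can set the 
flags on the result: the subtraction leaves the carry flag set exactly 
when reg >= 1 (unsigned), which is what BCS tests.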
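
The equivalence can be checked with a small C model. This is only a 
sketch: the function names (subs_one, old_branch_taken, new_branch_taken, 
check_equivalence) are hypothetical and not part of the patch, and the 
only architectural fact assumed is that an ARM subtraction sets the carry 
flag when no borrow occurs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "subs rd, rn, #1": returns rd and stores the
   carry flag, which is set when the subtraction does not borrow,
   i.e. when rn >= 1 as an unsigned comparison. */
uint32_t subs_one(uint32_t rn, bool *carry)
{
    *carry = rn >= 1u;
    return rn - 1u;
}

/* Old sequence: sub; cmn rd, #1; bne.  Branch taken when rd != -1. */
bool old_branch_taken(uint32_t rn)
{
    uint32_t rd = rn - 1u;
    return rd != UINT32_MAX;
}

/* New sequence: subs; bcs.  Branch taken when carry is set. */
bool new_branch_taken(uint32_t rn)
{
    bool carry;
    (void)subs_one(rn, &carry);
    return carry;
}

/* Compares both forms with the source-level test (x != 0) on a few
   boundary values. */
void check_equivalence(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x7fffffffu, 0x80000000u, UINT32_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t x = samples[i];
        assert(old_branch_taken(x) == (x != 0u));
        assert(new_branch_taken(x) == (x != 0u));
    }
}
```

Calling check_equivalence() from a test program exercises the boundary 
values; the only input for which either branch falls through is x == 0.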

Unfortunately, I've had to do this as a peephole2, since I can't get combine 
to generate the correct condition test on the branch instruction while 
preserving the comparison operation (it keeps canonicalizing the 
comparison into a NE).

On Thumb, things are more straightforward in that respect, but there are 
other problems to deal with.  In that case this patch enables us to convert

	mov	lo_t, #1
	neg	lo_t, lo_t
	sub	lo_d, lo_n, #1
	cmp	lo_d, lo_t
	bne	<dest>

into

	sub	lo_d, lo_n, #1	@ Thumb, so condition codes always set
	bcs	<dest>

I.e. a 60% reduction in the code (five instructions down to two).

Unfortunately, we have to take care to handle some special cases because 
reload cannot handle output-reloads on a JUMP_INSN.   In this case that's 
not a problem, because the two reload cases can be done by inserting 
instructions that do not affect the condition flags between the active 
flag setting instruction and the branch.  Thus if we need to reload into a 
HI register we use

	sub	lo_t, lo_n, #1
	mov	hi_d, lo_t	@ Flags not set
	bcs	<dest>

And if the target is a memory (a pseudo that isn't allocated a register) 
then we can use:

	sub	lo_t, lo_n, #1
	str	lo_t, [addr]
	bcs	<dest>

Any reloading of the input address can be handled by the existing reload 
code.
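
The reasoning behind the two reload cases can be sketched in C. This is 
an illustrative model only: struct state and the helper names are 
hypothetical, and the carry flag stands in for the condition codes. It 
assumes, as stated above, that a Thumb SUB sets the flags while a move to 
a hi register and a store do not, so the final branch still sees the 
flags produced by the SUB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical machine state: one lo scratch register, one hi
   register, one memory word and the carry flag. */
struct state {
    uint32_t lo_t, hi_d, mem;
    bool carry;
};

/* "sub lo_t, lo_n, #1": sets the flags (carry set means no borrow). */
void sub_one(struct state *s, uint32_t lo_n)
{
    s->carry = lo_n >= 1u;
    s->lo_t = lo_n - 1u;
}

/* "mov hi_d, lo_t": copies the result, flags untouched. */
void mov_hi(struct state *s) { s->hi_d = s->lo_t; }

/* "str lo_t, [addr]": stores the result, flags untouched. */
void str_mem(struct state *s) { s->mem = s->lo_t; }

/* Runs one of the two reload sequences and returns whether a branch
   on carry would be taken afterwards. */
bool branch_after_reload(struct state *s, uint32_t lo_n, bool to_mem)
{
    sub_one(s, lo_n);
    if (to_mem)
        str_mem(s);
    else
        mov_hi(s);
    return s->carry;    /* still reflects the SUB */
}

/* Both sequences must agree with the source-level test (x != 0). */
void check_reload_cases(void)
{
    struct state s = {0};
    assert(branch_after_reload(&s, 3u, false) == true);
    assert(branch_after_reload(&s, 3u, true) == true);
    assert(branch_after_reload(&s, 0u, false) == false);
    assert(branch_after_reload(&s, 0u, true) == false);
}
```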

Tested on arm-elf for both ARM and Thumb.  In ARM code there are no 
testsuite changes.  In Thumb code one regression is fixed (I'm not sure 
why; it might be a branch-range problem where reducing the code size 
allows the compiler to avoid clobbering LR on a branch).


2003-10-07  Richard Earnshaw  <>

	* (cmpsi2_addneg): New ARM pattern. Add peephole2 to generate
	(cbranchne_decr1): New Thumb pattern.
	* arm.c (arm_addimm_operand): New insn predicate.
	* arm-protos.h: Add a prototype for it.
	* arm.h (PREDICATE_CODES): Add it.

Attachment: branchne-dec.patch
Description: branchne-dec.patch
