This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: PR target/17101: question about powerpc s<cond> expanders


On Fri, Nov 12, 2004 at 05:00:34PM +0000, Nathan Sidwell wrote:
> PR 17101 is a problem with boolean operations.  rs6000.md contains
> seq, sne, sgt ... expanders, but the signed ones are explicitly
> disabled for non-POWER (i.e. POWERPC) targets. for instance
> 
> (define_expand "sgt"
>   [(clobber (match_operand:SI 0 "gpc_reg_operand" ""))]
>   ""
>   "
> {
>   if (! rs6000_compare_fp_p
>       && (! TARGET_POWER || rs6000_compare_op1 == const0_rtx))
>     FAIL;
> 
>   rs6000_emit_sCOND (GT, operands[0]);
>   DONE;
> }")
> 
> why is that?  The unsigned variants are not so encumbered.
> 
> Also, the signed variants cut out comparisons with zero -- why
> do the unsigned ones not do so? Oversight?  In addition, some of
> the special cases seem ineffective at best, pessimizing at worst.
> Examining each in detail I find the following assembler for 'V OP 0'
> 
> .seq: // baseline of 3 insns using the condition regs
>         cmpwi 7,3,0
>         mfcr 3
>         rlwinm 3,3,31,1

The shortest and fastest sequence here is likely:
	  cntlzw 3,3
	  srwi 3,3,5
i.e. no mfcr, which depends on all the CR fields (except on
Power4 and later, where there is a single-field mfcr).
Put an xor or sub in front to compare against any other value.
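
For anyone who wants to check the arithmetic, the cntlzw trick can be modeled in C. This is only a sketch: clz32, seq_zero and seq are names made up for illustration, and clz32 is a portable stand-in for cntlzw (which returns 32 for a zero input), not something taken from rs6000.md.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for cntlzw:
   number of leading zero bits, 32 when x == 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* x == 0: the count is 32 only for zero, and 32 >> 5 == 1;
   every other count (0..31) shifts down to 0. */
static uint32_t seq_zero(uint32_t x)
{
    return clz32(x) >> 5;      /* cntlzw ; srwi 5 */
}

/* x == y: an xor (or sub) in front reduces this to the zero case. */
static uint32_t seq(uint32_t x, uint32_t y)
{
    return seq_zero(x ^ y);    /* xor ; cntlzw ; srwi 5 */
}
```

The xor form is used in seq because x ^ y is zero exactly when x == y; a subtract would serve equally well.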

> 
> .sne: // hm, four insns emitted
>         srawi 0,3,31
>         xor 3,0,3
>         subf 3,3,0
>         srwi 3,3,31

Unless I'm mistaken, it's essentially a negated absolute
value shifted right by 31, and it works only for comparison
with 0. We can and should do better; for example, the
following should evaluate x!=y for any x and y, without even
clobbering the carry:
	sub t1,x,y
	sub t2,y,x
	or t1,t1,t2
	srwi t1,t1,31
(same size, but the first two can be executed in parallel;
if either difference is negative, the result is true).
For y=0, this reduces to the special case:
	neg t1,x
	or t1,t1,x
	srwi t1,t1,31
which saves one instruction and shortens the dependency
chain by one.
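
A minimal C model of the sne idea, assuming 32-bit wraparound arithmetic as on the target. The two differences taken are x - y and y - x; the function names sne and sne_zero are invented here for illustration.

```c
#include <stdint.h>

/* x != y: d = x - y is nonzero exactly when d or -d has its top
   bit set (for d == 0x80000000 both do). */
static uint32_t sne(uint32_t x, uint32_t y)
{
    uint32_t t1 = x - y;       /* sub t1,x,y */
    uint32_t t2 = y - x;       /* sub t2,y,x (independent of t1) */
    return (t1 | t2) >> 31;    /* or ; srwi 31 */
}

/* Special case y == 0: one subtraction becomes a neg, the other
   disappears. */
static uint32_t sne_zero(uint32_t x)
{
    return ((0u - x) | x) >> 31;   /* neg ; or ; srwi 31 */
}
```

Unsigned types are used so that the wraparound is well defined in C; the bit patterns are the same as for signed operands.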

> 
> .sge: // 2 insns, this is better.
>         nor 3,3,3
>         srwi 3,3,31

There is no way to improve on this in simple cases. There are,
however, many equivalent solutions which may turn out to be better
when you have a complex logic expression. For example, if you generate:
	srwi 3,3,31
	xori 3,3,1
the xor might be absorbed into a negation, since Power/PPC has the
full complement of 8 logical operations (and, andc, nand, or, orc,
nor, xor, eqv). When writing short chunks of assembly code, I've
never had to worry about getting the right "polarity" for inputs
of logical instructions (when immediates are not involved).
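
To make the polarity point concrete, here is a small C sketch of the two equivalent sge forms, plus one case where the inversion folds into a following logical operation. All function names are hypothetical, and and_with_sge assumes its first argument is already a 0/1 boolean.

```c
#include <stdint.h>

/* v >= 0, inverting first: nor 3,3,3 ; srwi 3,3,31 */
static uint32_t sge_zero_nor(int32_t v)
{
    uint32_t x = (uint32_t)v;
    return (uint32_t)~x >> 31;
}

/* v >= 0, inverting last: srwi 3,3,31 ; xori 3,3,1 */
static uint32_t sge_zero_xor(int32_t v)
{
    uint32_t x = (uint32_t)v;
    return (x >> 31) ^ 1u;
}

/* a && (v >= 0) for a in {0,1}: the inversion is absorbed by an
   and-with-complement, so it maps to srwi followed by andc. */
static uint32_t and_with_sge(uint32_t a, int32_t v)
{
    uint32_t sign = (uint32_t)v >> 31;   /* srwi t,x,31 */
    return a & ~sign;                    /* andc r,a,t  */
}
```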

> .sgt: // 3 insns, no better
>         srawi 0,3,31
>         subf 0,3,0
>         srwi 0,0,31

Using a variant of the code sequence for sle:
	addi 0,3,-1
	nor 0,0,3
	srwi 0,0,31
you avoid the srawi that clobbers the carry, but that's the only
(really minor) improvement I can think of. It only illustrates
what I said about the polarity.
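
As a sanity check, the sgt variant and the sle sequence it is derived from can be modeled in C as follows. This is a sketch with invented function names, relying on 32-bit unsigned wraparound for the x - 1 step.

```c
#include <stdint.h>

/* v <= 0: top bit of (x - 1) | x.
   addi 0,3,-1 ; or 0,0,3 ; srwi 0,0,31 */
static uint32_t sle_zero(int32_t v)
{
    uint32_t x = (uint32_t)v;
    return ((x - 1u) | x) >> 31;
}

/* v > 0: same sequence with the or replaced by nor.
   addi 0,3,-1 ; nor 0,0,3 ; srwi 0,0,31 */
static uint32_t sgt_zero(int32_t v)
{
    uint32_t x = (uint32_t)v;
    return (uint32_t)~((x - 1u) | x) >> 31;
}
```

The top bit of (x - 1) | x is clear only when both x and x - 1 are non-negative, i.e. when x > 0, which is why a single change of polarity turns one test into the other.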

> 
> .sle: // 3 insns, no better
>         addi 0,3,-1
>         or 0,0,3
>         srwi 0,0,31

Better than any mfcr-based solution.

> 
> .slt: // 1 insn, yay!
>         srwi 3,3,31

That one is rather obvious.

	Regards,
	Gabriel

