This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: PR target/17101: question about powerpc s<cond> expanders
- From: Gabriel Paubert <paubert at iram dot es>
- To: Nathan Sidwell <nathan at codesourcery dot com>
- Cc: gcc <gcc at gcc dot gnu dot org>, David Edelsohn <dje at watson dot ibm dot com>, Geoffrey Keating <geoffk at apple dot com>
- Date: Tue, 16 Nov 2004 00:14:07 +0100
- Subject: Re: PR target/17101: question about powerpc s<cond> expanders
- References: <4194EC32.9080308@codesourcery.com>
On Fri, Nov 12, 2004 at 05:00:34PM +0000, Nathan Sidwell wrote:
> PR 17101 is a problem with boolean operations. rs6000.md contains
> seq, sne, sgt ... expanders, but the signed ones are explicitly
> disabled for non-POWER (i.e. POWERPC) targets. for instance
>
> (define_expand "sgt"
> [(clobber (match_operand:SI 0 "gpc_reg_operand" ""))]
> ""
> "
> {
> if (! rs6000_compare_fp_p
> && (! TARGET_POWER || rs6000_compare_op1 == const0_rtx))
> FAIL;
>
> rs6000_emit_sCOND (GT, operands[0]);
> DONE;
> }")
>
> why is that? The unsigned variants are not so encumbered.
>
> Also, the signed variants cut out comparisons with zero -- why
> do the unsigned ones not do so? Oversight? In addition, some of
> the special cases seem ineffective at best, pessimizing at worst.
> Examining each in detail I find the following assembler for 'V OP 0'
>
> .seq: // baseline of 3 insns using the condition regs
> cmpwi 7,3,0
> mfcr 3
> rlwinm 3,3,31,1
The shortest and fastest here is likely:
cntlzw 3,3
srwi 3,3,5
i.e. no mfcr, which depends on all the CR fields (except on
Power4 and later, where there is a single-field mfcr).
Put an xor or sub in front to compare against any other value.
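
For anyone who wants to convince themselves, here is a rough C model of
that sequence. It is only a sketch: the names cntlzw(), seq() and
check_seq() are made up for illustration, and cntlzw() is a plain loop
standing in for the instruction (it returns 32 for a zero input, which
is the case the shift by 5 picks out).

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for cntlzw: count leading zero bits, 32 for input 0. */
static uint32_t cntlzw(uint32_t v)
{
    uint32_t n = 0;
    while (n < 32 && !(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* x == y modelled as "xor; cntlzw; srwi 5": only a count of 32 has bit 5 set. */
static uint32_t seq(uint32_t x, uint32_t y)
{
    return cntlzw(x ^ y) >> 5;
}

/* Spot-check against the C == operator on a few edge values. */
static void check_seq(void)
{
    const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    for (int i = 0; i < 5; i++)
        for (int j = 0; j < 5; j++)
            assert(seq(v[i], v[j]) == (uint32_t)(v[i] == v[j]));
}
```

Any nonzero input gives a count of at most 31, so the shifted result
is 0 in every case but equality.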
>
> .sne: // hm, four insns emitted
> srawi 0,3,31
> xor 3,0,3
> subf 3,3,0
> srwi 3,3,31
Unless I'm mistaken, it's essentially a negative absolute
value shifted right by 31, and it works only for comparison
with 0. We can and should do better: for example, the following
should work to evaluate x!=y for any x and y, without even
clobbering the carry:
sub t1,x,y
sub t2,y,x
or t1,t1,t2
srwi t1,t1,31
(same size, but the first two can be executed in
parallel; if either difference is negative, the result is true).
In the case of y=0, the special case reduces to:
neg t1,x
or t1,t1,x
srwi t1,t1,31
which saves one instruction and reduces dependency
length by one.
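
A small C sketch of the same idea, assuming 32-bit wraparound
arithmetic on uint32_t and taking the two differences as x - y and
y - x. The function names sne(), sne_zero() and check_sne() are
hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* x != y: unless both differences are 0, at least one has its sign bit set. */
static uint32_t sne(uint32_t x, uint32_t y)
{
    uint32_t t1 = x - y;        /* sub  t1,x,y   */
    uint32_t t2 = y - x;        /* sub  t2,y,x   */
    return (t1 | t2) >> 31;     /* or; srwi 31   */
}

/* Special case y == 0: the two differences become -x and x. */
static uint32_t sne_zero(uint32_t x)
{
    return ((0u - x) | x) >> 31;    /* neg; or; srwi 31 */
}

/* Spot-check against the C != operator on a few edge values. */
static void check_sne(void)
{
    const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    for (int i = 0; i < 5; i++) {
        assert(sne_zero(v[i]) == (uint32_t)(v[i] != 0u));
        for (int j = 0; j < 5; j++)
            assert(sne(v[i], v[j]) == (uint32_t)(v[i] != v[j]));
    }
}
```

The only value equal to its own negation besides 0 is 0x80000000,
and that one already has the sign bit set, so the or never misses.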
>
> .sge: // 2 insns, this is better.
> nor 3,3,3
> srwi 3,3,31
No way to improve in simple cases. There are many equivalent
solutions, however, which may turn out to be better when you have
a complex logic expression. For example if you generate:
srwi 3,3,31
xori 3,3,1
the xor might be absorbed in a negation since Power/PPC has the
full complement of 8 logical operations (and, andc, nand, or, orc,
nor, xor, eqv). When writing short chunks of assembly code, I've
never had to worry about getting the right "polarity" for inputs
of logical instructions (when immediates are not involved).
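
To illustrate that the two forms really are interchangeable, here is a
minimal C sketch (sge_nor(), sge_xori() and check_sge() are names
invented for this example; it assumes a 32-bit unsigned int so that
~x stays a 32-bit value).

```c
#include <assert.h>
#include <stdint.h>

/* x >= 0 as "nor 3,3,3; srwi 3,3,31": complement, then extract the sign bit. */
static uint32_t sge_nor(uint32_t x)
{
    return (uint32_t)~x >> 31;
}

/* x >= 0 as "srwi 3,3,31; xori 3,3,1": extract the sign bit, then flip it. */
static uint32_t sge_xori(uint32_t x)
{
    return (x >> 31) ^ 1u;
}

/* Both forms must agree with "sign bit clear" on a few edge values. */
static void check_sge(void)
{
    const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    for (int i = 0; i < 5; i++) {
        uint32_t expected = (uint32_t)(v[i] < 0x80000000u);
        assert(sge_nor(v[i]) == expected);
        assert(sge_xori(v[i]) == expected);
    }
}
```

In the second form the final flip is a separate step, which is what
leaves room for a following logical instruction to absorb it.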
> .sgt: // 3 insns, no better
> srawi 0,3,31
> subf 0,3,0
> srwi 0,0,31
Using a variant of the code sequence for sle:
addi 0,3,-1
nor 0,0,3
srwi 0,0,31
you avoid the srawi that clobbers the carry, but that's the only
(really minor) improvement I can think of. It only illustrates
what I said about the polarity.
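
As a sanity check, a short C sketch of the addi/or and addi/nor pair,
again with hypothetical names (sle_zero(), sgt_zero(), check_sgt_sle())
and assuming 32-bit wraparound on uint32_t:

```c
#include <assert.h>
#include <stdint.h>

/* x <= 0 as "addi 0,3,-1; or 0,0,3; srwi 0,0,31". */
static uint32_t sle_zero(uint32_t x)
{
    return ((x - 1u) | x) >> 31;
}

/* x > 0 as "addi 0,3,-1; nor 0,0,3; srwi 0,0,31": same thing, inverted. */
static uint32_t sgt_zero(uint32_t x)
{
    return (uint32_t)~((x - 1u) | x) >> 31;
}

/* Compare with the signed meaning: positive = nonzero with sign bit clear. */
static void check_sgt_sle(void)
{
    const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    for (int i = 0; i < 5; i++) {
        uint32_t positive = (uint32_t)(v[i] != 0u && v[i] < 0x80000000u);
        assert(sgt_zero(v[i]) == positive);
        assert(sle_zero(v[i]) == (positive ^ 1u));
    }
}
```

The most negative value is the interesting case: x - 1 wraps to a
positive number there, but x itself still supplies the sign bit.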
>
> .sle: // 3 insns, no better
> addi 0,3,-1
> or 0,0,3
> srwi 0,0,31
Better than any mfcr-based solution.
>
> .slt: // 1 insn, yay!
> srwi 3,3,31
That one is rather obvious.
Regards,
Gabriel