[rs6000] Add support for signed overflow arithmetic

Segher Boessenkool segher@kernel.crashing.org
Mon Oct 24 17:31:00 GMT 2016


On Mon, Oct 24, 2016 at 06:14:48PM +0200, Eric Botcazou wrote:
> > Maybe the best you can do is generate the double-width result, and then
> > check if the upper half is the sign extension of the lower half.  Maybe
> > some trickery can help (for add/sub/neg at least).
> 
> That's inefficient, even for additive operations.

It's better than the generic branch sequence below, or yours.  It still
sucks, obviously.

Let's see.  Completely untested.  Inputs in regs 3 and 4, output in reg 3.
32-bit code all the way.

add:
	eqv 9,3,4
	add 3,3,4
	xor 4,3,4
	and. 4,9,4
	blt <overflow>
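
As a rough cross-check of the add sequence, here is a C model of it.  The
helper name add_overflows is made up for illustration, and the final cast
back to int32_t assumes the usual wrapping conversion (which GCC
documents); treat it as a sketch, not as what the expander would emit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the eqv/add/xor/and. sequence. */
static bool add_overflows(int32_t a, int32_t b, int32_t *sum)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t same_sign = ~(ua ^ ub);   /* eqv: bit 31 set if a, b agree in sign */
    uint32_t us = ua + ub;             /* add: wraps modulo 2^32 */
    uint32_t flipped = us ^ ub;        /* xor: bit 31 set if sum, b differ in sign */

    *sum = (int32_t)us;                /* assumes wrapping conversion (GCC) */
    /* and. + blt: overflow iff the top bit of the AND is set. */
    return ((same_sign & flipped) >> 31) != 0;
}
```

The only bit that matters is the sign bit: overflow happens exactly when
both inputs have the same sign and the sum has the other one.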

sub:
	xor 9,3,4
	sub 3,3,4
	eqv 4,3,4
	and. 4,9,4
	blt <overflow>
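
The sub sequence swaps the roles of xor and eqv.  A C model, again only a
sketch with a made-up helper name (sub_overflows) and the same assumption
about the wrapping cast back to int32_t:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the xor/sub/eqv/and. sequence. */
static bool sub_overflows(int32_t a, int32_t b, int32_t *diff)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t differ = ua ^ ub;         /* xor: bit 31 set if a, b differ in sign */
    uint32_t ud = ua - ub;             /* sub: a - b, wraps modulo 2^32 */
    uint32_t same_as_b = ~(ud ^ ub);   /* eqv: bit 31 set if result, b agree in sign */

    *diff = (int32_t)ud;               /* assumes wrapping conversion (GCC) */
    /* and. + blt: overflow iff the top bit of the AND is set. */
    return ((differ & same_as_b) >> 31) != 0;
}
```

Here overflow needs inputs of opposite sign and a result whose sign
matches the subtrahend rather than the minuend.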

neg:
	neg 3,3
	xoris 9,3,0x8000
	cmpwi 9,0
	beq <overflow>
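
For neg the only overflowing input is the most negative value, which
negates to itself.  A C sketch of the idea (neg_overflows is a made-up
name; the cast back to int32_t assumes GCC's wrapping conversion):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: negate, flip bit 31, test for zero. */
static bool neg_overflows(int32_t a, int32_t *res)
{
    uint32_t un = 0u - (uint32_t)a;    /* neg: wraps modulo 2^32 */
    uint32_t t = un ^ 0x80000000u;     /* xor with 0x8000 in the upper halfword */

    *res = (int32_t)un;                /* assumes wrapping conversion (GCC) */
    return t == 0;                     /* beq: result was 0x80000000 */
}
```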

mul:
	mulhw 9,3,4
	mullw 3,3,4
	srawi 4,3,31
	cmpw 4,9
	bne <overflow>
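
The mul case is the plain double-width check: the product fits iff the
high word equals the sign extension of the low word.  A C model of that
check, as a sketch (mul_overflows is a made-up name; the cast back to
int32_t assumes GCC's wrapping conversion):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper modelling the double-width multiply check. */
static bool mul_overflows(int32_t a, int32_t b, int32_t *res)
{
    int64_t wide = (int64_t)a * (int64_t)b;            /* exact 64-bit product */
    uint32_t lo = (uint32_t)wide;                      /* low word, as mullw gives */
    uint32_t hi = (uint32_t)((uint64_t)wide >> 32);    /* high word, as mulhw gives */
    uint32_t lo_sign = (lo >> 31) ? 0xFFFFFFFFu : 0u;  /* low word shifted right
                                                          arithmetically by 31 */

    *res = (int32_t)lo;                                /* assumes wrapping conversion */
    return hi != lo_sign;                              /* cmpw + bne */
}
```

A case worth keeping in mind is 65536 * 32768: the high word is 0, but
the low word is 0x80000000, so the high word alone looks harmless while
the 32-bit result is still wrong.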

> > You can also just FAIL the expander if !TARGET_MCRXR.  I wonder just how
> > bad the generic code is.
> 
> It is branchy.  Here's a 32-bit overflow addition at -O2:
> 
> 	cmpwi 7,4,0
> 	add 4,3,4
> 	blt- 7,.L4
> 	cmpw 7,4,3
> 	blt- 7,.L3
> .L5:
> 	mr 3,4
> 	blr
> .L4:
> 	cmpw 7,4,3
> 	ble+ 7,.L5
> .L3:
> 	<overflow>
> 
> You can do it manually with just one branch:
> 
> 	add 10,4,3
> 	srwi 4,4,31
> 	cmpw 7,10,3
> 	mfcr 9
> 	rlwinm 9,9,29,1
> 	cmpw 7,9,4
> 	bne- 7,.L5
> 	mr 3,10
> 	blr
> .L5:
> 	<overflow>
> 
> and of course with -mmcrxr:
> 
> 	addo 3,3,4
> 	mcrxr 7
> 	bgt- 7,.L10
> 	blr
> .L10:
> 	<overflow>

Or using mcrxr (or mtxer) and SO:

	mcrxr 0 # clear XER[SO], can use mtxer instead
	...
	addo. 3,3,4
	bso .L10
	blr
.L10:
	etc.

(but keeping track of when your SO flag is clear is a pain, and if you
have to reset it all the time there is no big win).


Segher


