[PATCH GCC][4/4]Simplify (cond (cmp x c1) (op x c2) c3) -> (op (minmax x c1) c2)

Bin Cheng Bin.Cheng@arm.com
Tue Oct 25 11:22:00 GMT 2016


Hi,
As described in the comment in the patch, this one simplifies (cond (cmp x c1) (op x c2) c3) into (op (minmax x c1) c2) if:

     1) OP is PLUS or MINUS.
     2) CMP is LT, LE, GT or GE.
     3) C3 == (C1 op C2), and the expression doesn't invoke undefined behavior.
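
As a rough illustration of the three conditions above, the following C sketch compares the conditional form with the min/max form for one LE/PLUS case and one GE/MINUS case.  The function names and the constants in the MINUS case are made up for illustration and are not taken from the patch; it assumes a 32-bit unsigned int, so none of the additions wrap.

```c
/* Illustrative sketch only; names and constants are assumptions, not
   taken from the patch.  Assumes unsigned int is at least 32 bits.  */

/* (cond (le x c1) (plus x c2) c3) with c1 = 32768, c2 = 32768,
   c3 = c1 + c2 = 65536.  */
unsigned int cond_plus (unsigned int x)
{
  return x <= 32768u ? x + 32768u : 65536u;
}

/* (plus (min x c1) c2): the simplified form.  */
unsigned int min_plus (unsigned int x)
{
  unsigned int m = x < 32768u ? x : 32768u;
  return m + 32768u;
}

/* (cond (ge x c1) (minus x c2) c3) with c1 = 100, c2 = 40,
   c3 = c1 - c2 = 60.  */
unsigned int cond_minus (unsigned int x)
{
  return x >= 100u ? x - 40u : 60u;
}

/* (minus (max x c1) c2): the simplified form.  */
unsigned int max_minus (unsigned int x)
{
  unsigned int m = x > 100u ? x : 100u;
  return m - 40u;
}

/* Count inputs in [0, limit] on which either pair disagrees.  */
unsigned int count_mismatches (unsigned int limit)
{
  unsigned int x, bad = 0;
  for (x = 0; ; x++)
    {
      if (cond_plus (x) != min_plus (x))
        bad++;
      if (cond_minus (x) != max_minus (x))
        bad++;
      if (x == limit)   /* Exit test here so limit == UINT_MAX is safe.  */
        break;
    }
  return bad;
}
```

Calling count_mismatches over any range should return 0 if the two forms really are equivalent under these conditions.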

   This pattern also handles special cases like:

     1) Comparison's operand x is an unsigned to signed type conversion
	and c1 is integer zero.  In this case,
	  (signed type)x  < 0  <=>  x  > MAX_VAL(signed type)
	  (signed type)x >= 0  <=>  x <= MAX_VAL(signed type)
     2) Const c1 may not be equal to (C3 op' C2).  In this case we also
	check equality for (c1+1) and (c1-1) by adjusting comparison
	code.
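
The two special cases can be checked with a small C sketch like the one below.  The function names and the constants in the second check are assumptions chosen for illustration.  The first check assumes a 16-bit two's-complement short and relies on the out-of-range unsigned-to-signed conversion wrapping, which is implementation-defined in C but is what GCC does.

```c
#include <limits.h>

/* Illustrative sketch only; names and constants are assumptions.  */

/* Special case 1: count the unsigned short values for which the signed
   test on the converted value disagrees with the unsigned test on the
   original value.  */
unsigned long sign_test_mismatches (void)
{
  unsigned long v, bad = 0;
  for (v = 0; v <= USHRT_MAX; v++)
    {
      unsigned short x = (unsigned short) v;
      short s = (short) x;  /* Implementation-defined for x > SHRT_MAX.  */
      if ((s < 0) != (x > SHRT_MAX))
        bad++;
      if ((s >= 0) != (x <= SHRT_MAX))
        bad++;
    }
  return bad;
}

/* Special case 2: here c1 = 32769 with LT, c2 = 32768, c3 = 65536, so
   c3 - c2 is c1 - 1 rather than c1.  Rewriting x < 32769 as
   x <= 32768 lets the min form use the adjusted bound.
   LIMIT must be smaller than ULONG_MAX.  */
unsigned long adjusted_bound_mismatches (unsigned long limit)
{
  unsigned long x, bad = 0;
  for (x = 0; x <= limit; x++)
    {
      unsigned long with_lt = x < 32769ul ? x + 32768ul : 65536ul;
      unsigned long m = x < 32768ul ? x : 32768ul;
      if (with_lt != m + 32768ul)
        bad++;
    }
  return bad;
}
```

Both functions should return 0 when the equivalences hold on the target.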

Also note: although the signed type is handled by this pattern, it cannot be simplified at the moment because the C standard requires additional type promotion.  In order to match and simplify signed type cases, the IR needs to be cleaned up by other optimizers, i.e., VRP.

For the given loop:
int foo1 (unsigned short a[], unsigned int x)
{
  unsigned int i;
  for (i = 0; i < 1000; i++)
    {
      x = a[i];
      a[i] = (unsigned short)(x <= 32768 ? x + 32768 : 0);
    }
  return x;
}
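
At the source level, the effect of the simplification on this loop can be pictured roughly as below.  This is only a hand-written sketch: foo1_minmax is a hypothetical name, and the actual transformation happens on GCC's IR rather than on C source.  Here C1 = C2 = 32768, and C3 = 0 equals C1 + C2 once the sum is truncated to unsigned short, which is why the whole computation fits in 16 bits.  It assumes a 16-bit unsigned short and a 32-bit int.

```c
/* Illustrative sketch only; foo1_minmax is a hypothetical name.  */

/* The loop from the mail, unchanged.  */
int foo1 (unsigned short a[], unsigned int x)
{
  unsigned int i;
  for (i = 0; i < 1000; i++)
    {
      x = a[i];
      a[i] = (unsigned short)(x <= 32768 ? x + 32768 : 0);
    }
  return x;
}

/* The same loop written as (plus (min x c1) c2) on the narrow type.  */
int foo1_minmax (unsigned short a[], unsigned int x)
{
  unsigned int i;
  for (i = 0; i < 1000; i++)
    {
      unsigned short v = a[i];
      unsigned short m = v < 32768 ? v : 32768;  /* min (v, 32768)  */
      x = v;
      /* 32768 + 32768 truncates to 0, matching the else arm above.  */
      a[i] = (unsigned short)(m + 32768);
    }
  return x;
}
```

Running both functions on identical copies of an array should leave the arrays equal and return the same value.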

Generated assembly can be improved from:
.L4:
	ldr	q5, [x3, x1]
	add	w2, w2, 1
	cmp	w0, w2
	ushll	v1.4s, v5.4h, 0
	ushll2	v0.4s, v5.8h, 0
	add	v4.4s, v1.4s, v2.4s
	add	v3.4s, v0.4s, v2.4s
	cmhs	v1.4s, v2.4s, v1.4s
	cmhs	v0.4s, v2.4s, v0.4s
	and	v1.16b, v4.16b, v1.16b
	and	v0.16b, v3.16b, v0.16b
	xtn	v3.4h, v1.4s
	xtn2	v3.8h, v0.4s
	str	q3, [x3, x1]
	add	x1, x1, 16
	bhi	.L4

To:
.L4:
	ldr	q1, [x3, x1]
	add	w2, w2, 1
	cmp	w0, w2
	umin	v0.8h, v1.8h, v2.8h
	add	v0.8h, v0.8h, v2.8h
	str	q0, [x3, x1]
	add	x1, x1, 16
	bhi	.L4

Bootstrapped and tested on x86_64 and AArch64 with the whole patch set.  Any comments?

Thanks,
bin

2016-10-21  Bin Cheng  <bin.cheng@arm.com>

	* match.pd ((cond (cmp x c1) (op x c2) c3) -> (op (minmax x c1) c2)):
	New pattern.

gcc/testsuite/ChangeLog
2016-10-21  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/fold-bopcond-1.c: New test.
	* gcc.dg/fold-bopcond-2.c: New test.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 04-simplify-bopcond-20161021.txt
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20161025/fd352cdb/attachment.txt>

