[PATCH] Optimize "a || b" into "(a | b) != 0"
Roger Sayle
roger@eyesopen.com
Fri May 17 23:48:00 GMT 2002
In a recent post to gcc-patches, I explained that a bit-wise and
couldn't be used to implement C's && operator. However, it is
possible to implement C's || operator with a bit-wise or.
The following patch converts "(a != 0) || (b != 0)" into the
equivalent "(a | b) != 0" and converts "(a == 0) && (b == 0)"
into the equivalent "(a | b) == 0". These transformations are
only applied when BRANCH_COST >= 2 and "b" can be evaluated
unconditionally and its operands are simple.
Consider the following two functions:
int
foo (int a, int b)
{
return a || b;
}
int
bar (int a, int b)
{
return !a && !b;
}
On the alpha, CVS mainline currently generates the following code
foo: cmpult $31,$16,$0
cmpult $31,$17,$17
bis $0,$17,$0
ret $31,($26),1
bar: cmpeq $16,0,$0
cmpeq $17,0,$17
and $0,$17,$0
ret $31,($26),1
with the patch below we now generate the shorter sequences
foo: bis $16,$17,$0
cmpult $31,$0,$0
ret $31,($26),1
bar: bis $16,$17,$0
cmpeq $0,0,$0
ret $31,($26),1
For the IA-32, the improvement is even more pronounced. On CVS we
generate the following with "-O2 -fomit-frame-point -mbranch-cost=2"
foo: movl 4(%esp), %edx
movl 8(%esp), %eax
testl %edx, %edx
setne %dl
testl %eax, %eax
setne %al
orl %edx, %eax
andl $1, %eax
ret
bar: movl 4(%esp), %eax
movl 8(%esp), %ecx
testl %eax, %eax
sete %dl
testl %ecx, %ecx
sete %al
andl %edx, %eax
andl $1, %eax
ret
With this patch applied and with the same options we now get
foo: movl 8(%esp), %eax
orl 4(%esp), %eax
setne %al
movzbl %al, %eax
ret
bar: movl 8(%esp), %eax
orl 4(%esp), %eax
sete %al
movzbl %al, %eax
ret
This patch has been tested by a complete "make bootstrap" and
"make -k check" on both i686-pc-linux-gnu and alphaev6-dec-osf5.1
with no new regressions. Because x86 has a default BRANCH_COST
of one this transformation wasn't triggered during bootstrap, but
on alpha (with a default BRANCH_COST of 5) both transformations
fired several hundred times.
I've not included a testcase, as its difficult to check whether
the transformation has been applied, especially when its platform
and compiler flag dependent, and there seems no need for an extra
correctness test of "a || b" as if this ever stopped working we'd
hear about it almost immediately in stage2 :>.
One comment on the implementation. I take the cautious approach
of testing that the types of "a" and "b" are identical. In theory,
the transformation could easily be improved by either zero (or sign)
extending the narrower type to the wider type. I wasn't sure of
the best way to do this. It would have been convenient to use the
GCC function "common_type", but this looks like its specific to the
C and C++ front-ends.
Ok for mainline CVS?
2002-05-17 Roger Sayle <roger@eyesopen.com>
* fold-const.c (fold_truthop): Transform "a || b" into "(a|b) != 0"
and "!p && !q" into "(p|q) == 0" under suitable conditions.
Index: fold-const.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/fold-const.c,v
retrieving revision 1.202
diff -c -3 -p -r1.202 fold-const.c
*** fold-const.c 15 May 2002 20:39:55 -0000 1.202
--- fold-const.c 17 May 2002 21:24:38 -0000
*************** fold_truthop (code, truth_type, lhs, rhs
*** 3505,3511 ****
&& ! FLOAT_TYPE_P (TREE_TYPE (rl_arg))
&& simple_operand_p (rl_arg)
&& simple_operand_p (rr_arg))
! return build (code, truth_type, lhs, rhs);
/* See if the comparisons can be merged. Then get all the parameters for
each side. */
--- 3505,3533 ----
&& ! FLOAT_TYPE_P (TREE_TYPE (rl_arg))
&& simple_operand_p (rl_arg)
&& simple_operand_p (rr_arg))
! {
! /* Convert (a != 0) || (b != 0) into (a | b) != 0. */
! if (code == TRUTH_OR_EXPR
! && lcode == NE_EXPR && integer_zerop (lr_arg)
! && rcode == NE_EXPR && integer_zerop (rr_arg)
! && TREE_TYPE (ll_arg) == TREE_TYPE (rl_arg))
! return build (NE_EXPR, truth_type,
! build (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
! ll_arg, rl_arg),
! integer_zero_node);
!
! /* Convert (a == 0) && (b == 0) into (a | b) == 0. */
! if (code == TRUTH_AND_EXPR
! && lcode == EQ_EXPR && integer_zerop (lr_arg)
! && rcode == EQ_EXPR && integer_zerop (rr_arg)
! && TREE_TYPE (ll_arg) == TREE_TYPE (rl_arg))
! return build (EQ_EXPR, truth_type,
! build (BIT_IOR_EXPR, TREE_TYPE (ll_arg),
! ll_arg, rl_arg),
! integer_zero_node);
!
! return build (code, truth_type, lhs, rhs);
! }
/* See if the comparisons can be merged. Then get all the parameters for
each side. */
Roger
--
Roger Sayle, E-mail: roger@eyesopen.com
OpenEye Scientific Software, WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road, Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507. Fax: (+1) 505-473-0833
More information about the Gcc-patches
mailing list