Bug 14842 - Another way of expanding a switch statement
Summary: Another way of expanding a switch statement
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.0.0
: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2004-04-04 10:00 UTC by Kazu Hirata
Modified: 2012-01-05 23:16 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-01-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kazu Hirata 2004-04-04 10:00:43 UTC
Convert foo() to baz() (at expand time or very late in tree optimization):

void bar (void);

void
foo (int a)
{
  switch (a)
    {
    case 5:
    case 7:
      bar ();
    }
}

void
baz (int a)
{
  if ((a & ~2) == 5)
    bar ();
}

On i686-pc-linux-gnu with -O2 -fomit-frame-pointer, I get:

	.text
	.p2align 2,,3
.globl foo
	.type	foo, @function
foo:
	movl	4(%esp), %eax	# 3	*movsi_1/1	[length = 4]
	cmpl	$5, %eax	# 9	*cmpsi_1_insn/1	[length = 3]
	je	.L3	# 10	*jcc_1	[length = 2]
	cmpl	$7, %eax	# 11	*cmpsi_1_insn/1	[length = 3]
	je	.L3	# 12	*jcc_1	[length = 2]
	ret	# 35	return_internal	[length = 1]
	.p2align 2,,3
.L3:
	jmp	bar	# 19	*call_0	[length = 5]
	.size	foo, .-foo
	.p2align 2,,3
.globl baz
	.type	baz, @function
baz:
	movl	4(%esp), %eax	# 29	*movsi_1/1	[length = 4]
	andl	$-3, %eax	# 9	*andsi_1/1	[length = 3]
	cmpl	$5, %eax	# 10	*cmpsi_1_insn/1	[length = 3]
	je	.L8	# 11	*jcc_1	[length = 2]
	ret	# 32	return_internal	[length = 1]
	.p2align 2,,3
.L8:
	jmp	bar	# 16	*call_0	[length = 5]
	.size	baz, .-baz

Comparing baz() to foo(), aside from the small size reduction,
notice that we have eliminated one conditional jump.
(If we need to preserve the value of "a", we don't get
the size reduction anymore because we need to make a copy of "a", but
we can still eliminate one conditional jump.)

Depending on a situation, we can either entirely replace the binary tree
expansion of the switch statement (like this case) or replace a part of
the binary tree.

If we think of this problem as minimizing a circuit recognizing 5 and 7,
we can use Quine-McCluskey minimization to make (a & ~2) == 5,
but a sequential program is different from a circuit, so the result of 
Quine-McCluskey minimization is not always optimal on a sequential machine.
It may not be a bad idea to just specialize for a small case involving only two
cases like this one.
Comment 1 Andrew Pinski 2004-04-04 16:12:35 UTC
Confirmed.
Comment 2 Andrew Pinski 2004-04-04 17:13:02 UTC
PPC it is even more impressive (I think but then again I am a PPC person):
_foo:
        cmpwi cr7,r3,5  ; compare r3 to 5 and put the results in cr7
        cmpwi cr6,r3,7  ; compare r3 to 7 and put the results in cr6
        beq- cr7,L3        ; branch if equal (cr7) to L3
        bnelr- cr6           ; return if equal (cr6)
L3:
        b L_bar$stub     ; sib called bar
        .align 2
        .globl _baz
_baz:
        rlwinm r3,r3,0,31,29  ; r3 = r3 & ~2
        cmpwi cr7,r3,5 ; compare word r3 to 5
        bnelr+ cr7  ; return if not equal
        b L_bar$stub ; sib called bar

Comment 3 Andrew Pinski 2012-01-05 23:02:34 UTC
We should also optimize:
void
bz (int a)
{
  if (a==5||a==7)
    bar();
}

Using the &~2 trick also.
Comment 4 Jakub Jelinek 2012-01-05 23:16:47 UTC
We already do, see PR46309.