11823 – Optimizing large jump tables for switch statements

Bug 11823 - Optimizing large jump tables for switch statements

Summary: Optimizing large jump tables for switch statements

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	3.4.0

Importance:	P3 enhancement
Target Milestone:	3.4.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2003-08-06 08:39 UTC by Gábor Lóki
Modified:	2003-10-17 17:47 UTC (History)
CC List:	1 user (show)

See Also:
Host:
Target:	arm-unknown-elf
Build:
Known to work:
Known to fail:
Last reconfirmed:	2003-08-06 20:03:59

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Gábor Lóki 2003-08-06 08:40:00 UTC

In some cases GCC generates very large jump tables, e.g. the table can contain
too few different case values wrt. the table size. In these cases when
optimizing for size it would be preferable to use some other mechanism for the
jumping problem. For example, in some cases it would be more effecient to
create simple compare and branch pairs for checking the case values.

--- c example ---
// arm-elf-gcc -S -g0 -Os -o large-jump.s large-jump.c
int i;
int func(int a)
{
  switch(a)
  {
    case 0:   i = 20; break;
    case 1:   i = 50; break;
    case 2:   i = 29; break;
    case 10:  i = 20; break;
    case 11:  i = 50; break;
    case 12:  i = 29; break;
    case 20:  i = 20; break;
    case 21:  i = 50; break;
    case 22:  i = 29; break;
    case 52:  i = 79; break;
    case 110: i = 27; break;
    default:  i = 77; break;
  }
  return i;
}

--- arm code ---
func:
	ldr	r2, .L17
	cmp	r0, #110
	ldrls	pc, [pc, r0, asl #2]
	b	.L14
	.p2align 2
.L15:
	.word	.L9
	.word	.L10
	.word	.L11
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L9
	.word	.L10
	.word	.L11
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L9
	.word	.L10
	.word	.L11
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L12
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L14
	.word	.L13
.L9:
	mov	r3, #20
	b	.L16
.L10:
	mov	r3, #50
	b	.L16
.L11:
	mov	r3, #29
	b	.L16
.L12:
	mov	r3, #79
	b	.L16
.L13:
	mov	r3, #27
	b	.L16
.L14:
	mov	r3, #77
.L16:
	str	r3, [r2, #0]
	ldr	r3, .L17
	ldr	r0, [r3, #0]
	mov	pc, lr

Comment 1 Andrew Pinski 2003-08-06 20:03:59 UTC

I agruee with this and maybe this one can be done with GCC's current infrostructor but 
who knows.

Comment 2 Steven Bosscher 2003-08-07 13:01:03 UTC

You should look at stmt.c:expand_end_case_type(), under the comment:

      /* If range of values is much bigger than number of values,
         make a sequence of conditional branches instead of a dispatch.
         If the switch-index is a constant, do it this way
         because we can optimize it.  */

The heuristic following this comment now does:

               || compare_tree_int (range, 10 * count) > 0

where "count" is the number of case labels, and range is the range of the case
labels.  In your example, count==11, and range==110, so you just hit the case
where it still gets expanded with a jump table.  (I suppose you knew this,
judging from your example ;-)

So why not replace this heuristic with something like:

               || compare_tree_int (range, (optimize_size ? 3 : 10) * count) > 0

and see if it helps...

Comment 3 Gábor Lóki 2003-09-05 06:47:35 UTC

The problem has been fixed.
See the following mail and cvs diff.
http://gcc.gnu.org/ml/gcc-patches/2003-08/msg02054.html
http://sources.redhat.com/cgi-bin/cvsweb.cgi/gcc/gcc/stmt.c.diff?r1=1.326&r2=1.327&cvsroot=gcc