Bug 47373

Summary: avoid goto table to reduce code size when optimized for size
Product: gcc
Reporter: Carrot <carrot>
Component: middle-end
Assignee: Not yet assigned to anyone <unassigned>
Status: NEW ---
Severity: enhancement
CC: ramana
Priority: P3
Keywords: missed-optimization
Version: 4.6.0
Target Milestone: ---
Host: linux
Target: arm-linux-androideabi
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-02-01 01:19:27
Attachments: modified testcase

Description Carrot 2011-01-20 08:53:52 UTC
Created attachment 23040
modified testcase

When I compiled infback.c from zlib 1.2.5 with the options -march=armv7-a -mthumb -Os, gcc 4.6 generated the following code for a large switch statement:

	subs	r3, r3, #11
	cmp	r3, #18
	bhi	.L16
	tbh	[pc, r3, lsl #1]
.L23:
	.2byte	(.L17-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L18-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L154-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L20-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L16-.L23)/2
	.2byte	(.L21-.L23)/2
	.2byte	(.L121-.L23)/2
.L17:

GCC generates a goto table for 19 cases. The table and the instructions that manipulate it occupy 19*2 + 10 = 48 bytes.

Actually, most of the targets in the table are the same: there are only 6 targets other than .L16. So if we generated a sequence of cmp and branch instructions instead, we would need only 6 cmp/branch pairs and one branch to the default label, which is only 4*6 + 2 = 26 bytes.
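To make the comparison concrete, here is a minimal C sketch of the two dispatch shapes. It is not the zlib source: the function names and the handler numbers 1 to 6 are made up for illustration. Only the case values are taken from the listing above (the table covers 11 through 29, and the six non-default entries sit at values 11, 13, 16, 20, 28 and 29).

```c
/* Hypothetical handlers: 0 stands for the default target (.L16),
   1..6 stand for the six distinct non-default targets. */

/* Dense form: a compiler may lower this to a 19-entry jump table. */
int dispatch_switch(unsigned state)
{
    switch (state) {
    case 11: return 1;
    case 13: return 2;
    case 16: return 3;
    case 20: return 4;
    case 28: return 5;
    case 29: return 6;
    default: return 0;
    }
}

/* Sparse form: six compare-and-branch pairs, then the default. */
int dispatch_chain(unsigned state)
{
    if (state == 11) return 1;
    if (state == 13) return 2;
    if (state == 16) return 3;
    if (state == 20) return 4;
    if (state == 28) return 5;
    if (state == 29) return 6;
    return 0;
}
```

Both functions return the same handler for every input, so the choice between them is purely a code-size and speed trade-off. Whether a given compiler actually emits a table for the first form depends on its own heuristics.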

When I randomly modified the source code, gcc sometimes generated absolute addresses in the goto table, which doubles the table size and makes the result worse. The modified source code is attached.
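The size arithmetic in this description can be written down as a small C sketch. The function and parameter names are illustrative only and are not GCC internals. The byte counts passed in are the ones used in this report (2-byte table entries, 10 bytes of dispatch code, 4 bytes per cmp/branch pair, 2 bytes for the final branch).

```c
#include <stddef.h>

/* Jump table: one entry per value in the covered range,
   plus the fixed dispatch code in front of it. */
size_t table_dispatch_bytes(size_t range, size_t entry_bytes,
                            size_t dispatch_bytes)
{
    return range * entry_bytes + dispatch_bytes;
}

/* Compare chain: one cmp + conditional branch per distinct target,
   plus one unconditional branch to the default label. */
size_t chain_dispatch_bytes(size_t targets, size_t pair_bytes,
                            size_t default_branch_bytes)
{
    return targets * pair_bytes + default_branch_bytes;
}
```

With these, table_dispatch_bytes(19, 2, 10) gives 48 and chain_dispatch_bytes(6, 4, 2) gives 26. For the absolute-address variant, the table alone is table_dispatch_bytes(19, 4, 0), i.e. 76 bytes instead of 38. The dispatch code size for that variant is not stated here, so it is left out.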

Comment 1 Ramana Radhakrishnan 2011-02-01 01:19:27 UTC
This is partly an issue in the RTL optimizers, in the way we expand switch tables, and partly an issue in the target.