Created attachment 23796
testcase

As mentioned in PR47373, GCC sometimes generates absolute addresses in the jump table, doubling the size of the table. I have now extracted a test case. Compiling it with trunk GCC and the options -march=armv7-a -mthumb -Os, I get

        ...
        ldr     r3, [fp, #0]
        subs    r3, r3, #11
.L14:
        cmp     r3, #18
        bhi     .L14
        adr     r0, .L21
        ldr     pc, [r0, r3, lsl #2]
        .align  2
.L21:
        .word   .L15+1
        .word   .L14+1
        .word   .L16+1
        .word   .L14+1
        .word   .L14+1
        .word   .L17+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L18+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L14+1
        .word   .L19+1
        .word   .L76+1
.L15:
        ...

This is the first problem: the relative addresses have become absolute addresses, which of course need 32-bit entries. The corresponding insn in infback.c.220r.nothrow is actually an addr_diff_vec, and I couldn't find how the absolute addresses end up being output.

(jump_insn:TI 85 83 86 7 (parallel [
            (set (pc)
                (if_then_else (leu (reg:SI 3 r3 [551])
                        (const_int 18 [0x12]))
                    (mem:SI (plus:SI (mult:SI (reg:SI 3 r3 [551])
                                (const_int 4 [0x4]))
                            (label_ref 86)) [0 S4 A32])
                    (label_ref:SI 82)))
            (clobber (reg:CC 24 cc))
            (clobber (reg:SI 0 r0))
            (use (label_ref 86))
        ]) src/zlib/infback.c:281 717 {thumb2_casesi_internal}
     (expr_list:REG_UNUSED (reg:CC 24 cc)
        (expr_list:REG_UNUSED (reg:SI 0 r0)
            (insn_list:REG_LABEL_TARGET 82 (nil))))
 -> 86)

(code_label 86 85 87 21 "" [2 uses])

(jump_insn 87 86 88 (addr_diff_vec:SI (label_ref:SI 86)
         [
            (label_ref:SI 89)
            (label_ref:SI 82)
            (label_ref:SI 180)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 232)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 484)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 82)
            (label_ref:SI 700)
            (label_ref:SI 762)
        ]
        (label_ref:SI 82)
        (label_ref:SI 762)) src/zlib/infback.c:281 -1
     (nil))

When I add -fpic to the command line, GCC generates the following:

        subs    r3, r3, #11
.L14:
        cmp     r3, #18
        bhi     .L14
        adr     r0, .L21
        ldr     r1, [r0, r3, lsl #2]
        add     r0, r0, r1
        bx      r0
        .align  2
.L21:
        .word   .L15+1-.L21
        .word   .L14+1-.L21
        .word   .L16+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L17+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L18+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L14+1-.L21
        .word   .L19+1-.L21
        .word   .L76+1-.L21
.L15:

Now we get a table of relative addresses, but the table entries are 4 bytes each, not the optimal 2-byte form. This is the second problem. The related source should be in arm.h:

#define CASE_VECTOR_SHORTEN_MODE(min, max, body)                        \
  (TARGET_THUMB1                                                        \
   ? (min >= 0 && max < 512                                             \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, QImode)        \
      : min >= -256 && max < 256                                        \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, QImode)        \
      : min >= 0 && max < 8192                                          \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, HImode)        \
      : min >= -4096 && max < 4096                                      \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, HImode)        \
      : SImode)                                                         \
   : ((min < 0 || max >= 0x2000 || !TARGET_THUMB2) ? SImode             \
      : (max >= 0x200) ? HImode                                         \
      : QImode))

Problems:

a) Is (max >= 0x2000) correct? Why not (max >= 0x20000)? The maximum unsigned short is 0xFFFF, and tbh branches by twice the stored halfword.

b) Although tbb/tbh require forward jumps (min >= 0), tbb/tbh do not have to be used. When min < 0, we can use separate instructions to load the offset and add it to pc. That is still a win over wider table entries in nearly all cases.
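To make question (a) concrete, here is a minimal standalone model of the non-Thumb-1 arm of the macro quoted above. It is only a sketch, not GCC code: `vec_mode`, `shorten_mode`, `table_bytes` and the `hi_limit` parameter are hypothetical names introduced for illustration, and the model assumes that min/max are byte offsets from the table label and that tbb/tbh branch by twice the stored entry.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's machine modes (not GCC's own enum).
   The numeric value is the size of one table entry in bytes.  */
enum vec_mode { MODE_QI = 1, MODE_HI = 2, MODE_SI = 4 };

/* Model of the non-Thumb-1 arm of CASE_VECTOR_SHORTEN_MODE.
   hi_limit is the first offset that is treated as too large for an
   HImode entry: 0x2000 in the quoted macro, 0x20000 in question (a).  */
enum vec_mode
shorten_mode (long min, long max, int thumb2, long hi_limit)
{
  assert (min <= max);
  if (min < 0 || max >= hi_limit || !thumb2)
    return MODE_SI;
  if (max >= 0x200)   /* tbb reaches at most 2 * 0xFF = 0x1FE bytes */
    return MODE_HI;
  return MODE_QI;
}

/* Size in bytes of a jump table with the given number of entries.  */
unsigned long
table_bytes (unsigned long entries, enum vec_mode mode)
{
  return entries * (unsigned long) mode;
}
```

Under these assumptions, a switch whose farthest target lies 0x2000 bytes past the table gets SImode entries with the current limit but HImode entries with a limit of 0x20000, since 2 * 0xFFFF = 0x1FFFE is still in tbh's reach. For the 19-entry table in the test case, that is the difference between 76 and 38 bytes.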
Another possible enhancement: we could also use HImode jump table entries in ARM mode. As in the min < 0 case, although tbh is not available in ARM mode, we can use separate instructions to load the offset and adjust pc.
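A rough C model of what "load the offset and adjust pc" amounts to with signed halfword entries, for both the min < 0 case and ARM mode. This is an illustration only: code addresses are modelled as plain integers, entries hold unscaled byte offsets from the table label, and `encode_entry` and `dispatch_target` are hypothetical helpers, not anything in GCC.

```c
#include <stddef.h>
#include <stdint.h>

/* Store (target - table_base) as a signed halfword entry.
   Returns 1 on success, 0 if the offset does not fit in 16 bits,
   in which case a wider entry would be needed.  */
int
encode_entry (uint32_t table_base, uint32_t target, int16_t *entry)
{
  int64_t diff = (int64_t) target - (int64_t) table_base;
  if (diff < INT16_MIN || diff > INT16_MAX)
    return 0;
  *entry = (int16_t) diff;
  return 1;
}

/* Model of the dispatch sequence: a sign-extending halfword load
   followed by an add to the table base.  Unsigned wraparound makes
   negative offsets reach targets below the table.  */
uint32_t
dispatch_target (uint32_t table_base, const int16_t *table, size_t index)
{
  int32_t offset = table[index];
  return table_base + (uint32_t) offset;
}
```

In this model a target 256 bytes before the table is stored as -256 and recovered correctly, which is the backward-jump case that tbh cannot express. The cost is an extra instruction or two in the dispatch sequence, against half the table size compared with SImode entries.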
Author: ramana
Date: Fri Aug 12 16:58:09 2011
New Revision: 177705

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177705
Log:
Fix PR target/48328 part 1

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.h