This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/48328] New: GCC failed to generate 16bit relative jump table
- From: "carrot at google dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 29 Mar 2011 08:28:11 +0000
- Subject: [Bug target/48328] New: GCC failed to generate 16bit relative jump table
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48328
Summary: GCC failed to generate 16bit relative jump table
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: carrot@google.com
Host: linux
Target: arm-eabi
Build: linux
Created attachment 23796
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23796
testcase
As mentioned in PR47373, GCC sometimes generates absolute addresses in the jump
table, which doubles the size of the table. I have now extracted a test case.
Compiling it with trunk GCC and the options -march=armv7-a -mthumb -Os, I get
...
ldr r3, [fp, #0]
subs r3, r3, #11
.L14:
cmp r3, #18
bhi .L14
adr r0, .L21
ldr pc, [r0, r3, lsl #2]
.align 2
.L21:
.word .L15+1
.word .L14+1
.word .L16+1
.word .L14+1
.word .L14+1
.word .L17+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L18+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L14+1
.word .L19+1
.word .L76+1
.L15:
...
This is the first problem: the relative addresses have become absolute
addresses, and the entries are therefore 32 bits wide.
The corresponding insn in infback.c.220r.nothrow is actually an addr_diff_vec,
and I couldn't find out how the absolute addresses end up being output.
(jump_insn:TI 85 83 86 7 (parallel [
(set (pc)
(if_then_else (leu (reg:SI 3 r3 [551])
(const_int 18 [0x12]))
(mem:SI (plus:SI (mult:SI (reg:SI 3 r3 [551])
(const_int 4 [0x4]))
(label_ref 86)) [0 S4 A32])
(label_ref:SI 82)))
(clobber (reg:CC 24 cc))
(clobber (reg:SI 0 r0))
(use (label_ref 86))
]) src/zlib/infback.c:281 717 {thumb2_casesi_internal}
(expr_list:REG_UNUSED (reg:CC 24 cc)
(expr_list:REG_UNUSED (reg:SI 0 r0)
(insn_list:REG_LABEL_TARGET 82 (nil))))
-> 86)
(code_label 86 85 87 21 "" [2 uses])
(jump_insn 87 86 88 (addr_diff_vec:SI (label_ref:SI 86)
[
(label_ref:SI 89)
(label_ref:SI 82)
(label_ref:SI 180)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 232)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 484)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 82)
(label_ref:SI 700)
(label_ref:SI 762)
]
(label_ref:SI 82)
(label_ref:SI 762)) src/zlib/infback.c:281 -1
(nil))
When I add -fpic to the command line, GCC generates the following:
subs r3, r3, #11
.L14:
cmp r3, #18
bhi .L14
adr r0, .L21
ldr r1, [r0, r3, lsl #2]
add r0, r0, r1
bx r0
.align 2
.L21:
.word .L15+1-.L21
.word .L14+1-.L21
.word .L16+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L17+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L18+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L14+1-.L21
.word .L19+1-.L21
.word .L76+1-.L21
.L15:
Now we get a table of relative addresses, but the table entries are 4 bytes
wide, not the optimal 2-byte form. This is the second problem.
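To put a number on the second problem, here is a minimal sketch of the size arithmetic. The entry count of 19 is read off the .L21 listing above (case values 0 through 18); the function name jump_table_bytes is hypothetical and the calculation ignores any .align padding.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries in the .L21 table shown above (indices 0..18). */
enum { JUMP_TABLE_ENTRIES = 19 };

/* Size in bytes of a jump table whose entries are entry_width bytes wide.
   Hypothetical helper for illustration; alignment padding is not counted. */
size_t jump_table_bytes(size_t entries, size_t entry_width)
{
    /* Only the widths that QImode, HImode and SImode entries would have. */
    assert(entry_width == 1 || entry_width == 2 || entry_width == 4);
    return entries * entry_width;
}
```

With 4-byte entries the table occupies 76 bytes; with 2-byte entries it would occupy 38.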
The relevant source should be in arm.h:
#define CASE_VECTOR_SHORTEN_MODE(min, max, body) \
  (TARGET_THUMB1 \
   ? (min >= 0 && max < 512 \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, QImode) \
      : min >= -256 && max < 256 \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, QImode) \
      : min >= 0 && max < 8192 \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 1, HImode) \
      : min >= -4096 && max < 4096 \
      ? (ADDR_DIFF_VEC_FLAGS (body).offset_unsigned = 0, HImode) \
      : SImode) \
   : ((min < 0 || max >= 0x2000 || !TARGET_THUMB2) ? SImode \
      : (max >= 0x200) ? HImode \
      : QImode))
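The non-Thumb-1 arm of the macro is the one that matters for -mthumb on armv7-a. As a reading aid, it can be sketched as a plain function. This is only a model of the quoted macro, not GCC code: the enum entry_mode and the name shorten_mode_thumb2 are hypothetical stand-ins for the machine modes and for TARGET_THUMB2.

```c
#include <stdbool.h>

/* Entry widths in bytes, standing in for GCC's QImode, HImode and SImode. */
enum entry_mode { ENTRY_QI = 1, ENTRY_HI = 2, ENTRY_SI = 4 };

/* Models the non-Thumb-1 arm of CASE_VECTOR_SHORTEN_MODE quoted above.
   min and max are the smallest and largest label offsets in the table;
   target_thumb2 stands in for TARGET_THUMB2. */
enum entry_mode shorten_mode_thumb2(long min, long max, bool target_thumb2)
{
    /* Any backward offset, any offset of 0x2000 or more, or a
       non-Thumb-2 target falls back to full 4-byte entries. */
    if (min < 0 || max >= 0x2000 || !target_thumb2)
        return ENTRY_SI;
    /* Offsets of 0x200 or more need halfword entries. */
    if (max >= 0x200)
        return ENTRY_HI;
    return ENTRY_QI;
}
```

In this model a single backward target (min < 0) is enough to force 4-byte entries for the whole table, whatever the value of max.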
Problems:
a) Is (max >= 0x2000) correct? Why not (max >= 0x20000)? The maximum unsigned
short is 0xFFFF.
b) Although tbb/tbh needs a forward jump (min >= 0), tbb/tbh does not have to
be used. When min < 0, we can use separate instructions to load the offset and
add it to pc. That is still a win compared with wider table entries in nearly
all cases.
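The arithmetic behind question a) can be sketched as follows, on the assumption that tbb and tbh branch forward by twice the unsigned byte or halfword read from the table, as the ARM architecture manual describes them. The function names are hypothetical, and the sketch says nothing about other constraints in the port that might justify the smaller bound.

```c
#include <stdint.h>

/* Largest forward offset in bytes reachable through a tbb entry:
   twice the largest unsigned byte. */
unsigned long tbb_max_offset(void)
{
    return 2ul * UINT8_MAX;
}

/* Largest forward offset in bytes reachable through a tbh entry:
   twice the largest unsigned halfword. */
unsigned long tbh_max_offset(void)
{
    return 2ul * UINT16_MAX;
}
```

Under that assumption the byte form reaches 0x1FE, which matches the existing QImode bound of 0x200, while the halfword form reaches 0x1FFFE, which would correspond to a bound of 0x20000 rather than 0x2000.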