[Bug tree-optimization/84011] Optimize switch table with run-time relocation
peter at cordes dot ca
gcc-bugzilla@gcc.gnu.org
Tue May 1 13:24:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84011
Peter Cordes <peter at cordes dot ca> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |peter at cordes dot ca
--- Comment #9 from Peter Cordes <peter at cordes dot ca> ---
(In reply to rguenther@suse.de from comment #4)
> An optimization would be to
> add an indirection by, say, only recording the constant offset
> into an "array of strings" in the table, thus effectively
>
> "case1\0case2\0..."[CSWITCH[i]]
>
> which would require only a relocation to access the single string
> constant. But it would prohibit cases of string merging within
> those strings unless we implement that as well for this optimization.
gcc already totally misses optimizations here where one string is a suffix of
another. "mii" could just be a pointer to the 3rd byte of "sgmii", but we
instead duplicate all the characters. That's where major savings are possible
for this function.
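A minimal hand-written sketch of the suffix sharing described above. The identifiers (sgmii_str, name_sgmii, name_mii) are hypothetical and only illustrate the layout; this is not what gcc emits today.

```c
#include <assert.h>

/* Hypothetical layout: store only the longer string once. */
static const char sgmii_str[] = "sgmii";

/* 5 characters plus the terminating NUL. */
static_assert(sizeof sgmii_str == 6, "unexpected string size");

/* "sgmii" starts at offset 0 of the shared storage. */
const char *name_sgmii(void)
{
    return sgmii_str;
}

/* "mii" is the suffix starting at the 3rd byte (offset 2) and
 * reuses the same terminating NUL, so no extra bytes are needed. */
const char *name_mii(void)
{
    return sgmii_str + 2;
}
```

Both names share 6 bytes of storage instead of the 10 bytes that two separate string constants would take.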
> Note this might be profitable unconditionally, not just with -fpie/pic
> as the CSWITCH table would be smaller (dependent on the total
> size of the merged string).
Indeed, I wrote up bug 85585 with ideas for optimizing this. A table of byte
or uint16_t offsets into a static buffer of packed strings looks good both for
PIC and for position-dependent code.
To avoid any runtime relocations, all you need is the ability to get a static
address into a register (e.g. RIP-relative LEA) and do an indexed load relative
to it, just like using a normal static char[]. Then add the load result to
that address. Runtime relocation is nice to avoid even if you don't *need* to
avoid it.
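A rough sketch of that offset-table scheme, written by hand in C. The enum, the "tbi" entry, and all identifiers are assumptions made up for illustration. Because the table holds small integers rather than pointers, it needs no runtime relocations; on x86-64 a compiler would typically form the base address with a RIP-relative LEA, though the exact instructions depend on the target and options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum; only "mii" and "sgmii" come from the discussion. */
enum mode { MODE_MII, MODE_SGMII, MODE_TBI, MODE_COUNT };

/* Packed NUL-terminated strings in one static buffer.
 * "mii" is not stored separately: it is the tail of "sgmii". */
static const char mode_strings[] = "sgmii\0" "tbi";

/* Byte offsets into mode_strings, indexed by enum value.
 * These are plain integers, so the table contains no addresses. */
static const uint8_t mode_offsets[MODE_COUNT] = {
    [MODE_MII]   = 2,  /* suffix of "sgmii" */
    [MODE_SGMII] = 0,
    [MODE_TBI]   = 6,  /* after "sgmii" and its NUL */
};

const char *mode_name(enum mode m)
{
    assert((unsigned)m < MODE_COUNT);
    /* base address + loaded offset: one indexed load, one add */
    return mode_strings + mode_offsets[m];
}
```

With uint8_t offsets the buffer is limited to 256 bytes; uint16_t offsets would lift that at the cost of a table twice the size.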
Also possible is padding each string out to a constant length and calculating
an index into that, removing a level of indirection. (Good when strings are
of similar length and/or all short, and there aren't many strings that are
duplicates or suffixes of others.) Again you just need to get a static address
into a register, and add it to 11*enum_value (11 being the padded length of
each string here). This is all ADD + LEA (with one of them being RIP-relative).
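A sketch of the padded variant under the same caveats: the enum and identifiers are hypothetical, and the stride of 11 is taken from the text above. A 2D char array makes the compiler compute base + 11*index directly, with no offset table and no pointers to relocate.

```c
#include <assert.h>

/* Stride taken from the 11*enum_value in the text: every slot is
 * padded with NULs to 11 bytes. */
enum { NAME_STRIDE = 11 };

/* Hypothetical enum for illustration. */
enum mode { MODE_MII, MODE_SGMII, MODE_COUNT };

/* Fixed-width slots; unused bytes in each slot are zero-filled. */
static const char mode_names[MODE_COUNT][NAME_STRIDE] = {
    [MODE_MII]   = "mii",
    [MODE_SGMII] = "sgmii",
};

const char *mode_name_padded(enum mode m)
{
    assert((unsigned)m < MODE_COUNT);
    /* Address is base + NAME_STRIDE * m; no load is needed. */
    return mode_names[m];
}
```

The cost is NAME_STRIDE bytes per entry regardless of string length, and suffix sharing is no longer possible.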