This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
icache invalidation patch applied to sh port
- From: Joern RENNECKE <joern dot rennecke at st dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 29 Nov 2006 14:42:34 +0000
- Subject: icache invalidation patch applied to sh port
Inlining the icache invalidation is usually more trouble than it's worth:
it makes the code dependent on the exact invalidation mechanism working
on the target microarchitecture.
Multilibbing a large library because of one or two infrequently executed
icache invalidations wastes a lot of build time and disk space that could
be employed more profitably for actual optimizations. With the icache
invalidation out-of-line, it is merely a matter of picking the right
ic_invalidate_array library at link time.
Regression tested on i686-pc-linux-gnu X sh-elf.
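
For readers skimming the patch, the choice the trampoline setup makes on
Harvard-architecture SH targets can be restated as a small standalone C
function. This is only a sketch of the condition in the sh.c hunk combined
with the behavior described in the invoke.texi hunk; the enum, the function
name and the has_icbi parameter (standing in for TARGET_SH4A_ARCH ||
TARGET_SH4_300) are hypothetical names for illustration, not identifiers
from GCC, and the SH5-specific expander branches are left out.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in GCC. */
enum ic_strategy
{
  IC_LIBRARY_CALL,      /* out-of-line call to __ic_invalidate */
  IC_INLINE_ICBI,       /* inline ocbwb / synco / icbi sequence */
  IC_INLINE_ADDR_ARRAY  /* inline associative write to the icache
                           address array (needs privileged mode) */
};

/* Mirrors the patched condition:
     !TARGET_INLINE_IC_INVALIDATE
     || (!(TARGET_SH4A_ARCH || TARGET_SH4_300) && TARGET_USERMODE)
   has_icbi stands in for (TARGET_SH4A_ARCH || TARGET_SH4_300).
   Assumes TARGET_HARVARD is already known to be true.  */
static enum ic_strategy
choose_ic_strategy (bool inline_ic_invalidate, bool has_icbi, bool usermode)
{
  /* Without -minline-ic_invalidate, or when the only inline code
     available would not work in user mode, use the library routine.  */
  if (!inline_ic_invalidate || (!has_icbi && usermode))
    return IC_LIBRARY_CALL;

  /* Inline code was requested and is usable: prefer icbi when the
     selected CPU has it, else fall back to the address array write.  */
  return has_icbi ? IC_INLINE_ICBI : IC_INLINE_ADDR_ARRAY;
}
```

Under this reading, requesting inline invalidation together with -musermode
on a CPU without icbi still yields the library call, which is the case the
documentation describes as the option having no effect.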
2006-11-27 J"orn Rennecke <joern.rennecke@st.com>
* sh.opt (minline-ic_invalidate): New option.
(musermode): Adjust comment.
* sh.c (sh_initialize_trampoline): Emit library call unless
-minline-ic_invalidate is set; if it is set, don't emit library call
if we can use icbi instead.
* sh.md (ic_invalidate_line, ic_invalidate_line_sh4a): Also use
icbi for TARGET_SH4_300.
* t-sh (LIB1ASMFUNCS_CACHE): Set.
* doc/invoke.texi: Document -minline-ic_invalidate; Update
-musermode documentation.
--- config/sh/sh.opt@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0 2006-11-06 17:30:56.000000000 +0000
+++ config/sh/sh.opt 2006-11-27 22:19:11.000000000 +0000
@@ -269,6 +269,10 @@ mindexed-addressing
Target Report Mask(ALLOW_INDEXED_ADDRESS) Condition(SUPPORT_ANY_SH5_32MEDIA)
Enable the use of the indexed addressing mode for SHmedia32/SHcompact
+minline-ic_invalidate
+Target Report Var(TARGET_INLINE_IC_INVALIDATE)
+inline code to invalidate instruction cache entries after setting up nested function trampolines
+
minvalid-symbols
Target Report Mask(INVALID_SYMBOLS) Condition(SUPPORT_ANY_SH5)
Assume symbols might be invalid
@@ -325,7 +329,7 @@ Cost to assume for a multiply insn
musermode
Target Report RejectNegative Mask(USERMODE)
-Generate library function call to invalidate instruction cache entries after fixing trampoline
+Don't generate privileged-mode only code; implies -mno-inline-ic_invalidate if the inline code would not work in user mode.
;; We might want to enable this by default for TARGET_HARD_SH4, because
;; zero-offset branches have zero latency. Needs some benchmarking.
--- config/sh/sh.c@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0 2006-11-06 17:39:25.000000000 +0000
+++ config/sh/sh.c 2006-11-27 22:19:25.000000000 +0000
@@ -9564,7 +9564,8 @@ sh_initialize_trampoline (rtx tramp, rtx
emit_move_insn (adjust_address (tramp_mem, SImode, 12), fnaddr);
if (TARGET_HARVARD)
{
- if (TARGET_USERMODE)
+ if (!TARGET_INLINE_IC_INVALIDATE
+ || !(TARGET_SH4A_ARCH || TARGET_SH4_300) && TARGET_USERMODE)
emit_library_call (function_symbol (NULL, "__ic_invalidate",
FUNCTION_ORDINARY),
0, VOIDmode, 1, tramp, SImode);
--- config/sh/sh.md@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0 2006-11-06 17:43:10.000000000 +0000
+++ config/sh/sh.md 2006-11-27 22:19:33.000000000 +0000
@@ -5249,7 +5249,7 @@ (define_expand "ic_invalidate_line"
emit_insn (gen_ic_invalidate_line_compact (operands[0], operands[1]));
DONE;
}
- else if (TARGET_SH4A_ARCH)
+ else if (TARGET_SH4A_ARCH || TARGET_SH4_300)
{
emit_insn (gen_ic_invalidate_line_sh4a (operands[0]));
DONE;
@@ -5277,7 +5277,7 @@ (define_insn "ic_invalidate_line_i"
(define_insn "ic_invalidate_line_sh4a"
[(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
UNSPEC_ICACHE)]
- "TARGET_SH4A_ARCH"
+ "TARGET_SH4A_ARCH || TARGET_SH4_300"
"ocbwb\\t@%0\;synco\;icbi\\t@%0"
[(set_attr "length" "16")
(set_attr "type" "cwb")])
--- config/sh/t-sh@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0 2006-11-06 17:47:11.000000000 +0000
+++ config/sh/t-sh 2006-11-27 22:19:38.000000000 +0000
@@ -7,6 +7,7 @@ LIB1ASMFUNCS = _ashiftrt _ashiftrt_n _as
_movmem_i4 _mulsi3 _sdivsi3 _sdivsi3_i4 _udivsi3 _udivsi3_i4 _set_fpscr \
_div_table _udiv_qrnnd_16 \
$(LIB1ASMFUNCS_CACHE)
+LIB1ASMFUNCS_CACHE = _ic_invalidate _ic_invalidate_array
# We want fine grained libraries, so use the new code to build the
# floating point emulation libraries.
--- doc/invoke.texi@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0 2006-11-27 22:26:07.000000000 +0000
+++ doc/invoke.texi 2006-11-27 23:02:56.000000000 +0000
@@ -682,7 +682,7 @@ See RS/6000 and PowerPC Options.
-m5-compact -m5-compact-nofpu @gol
-mb -ml -mdalign -mrelax @gol
-mbigtable -mfmovd -mhitachi -mrenesas -mno-renesas -mnomacsave @gol
--mieee -misize -mpadstruct -mspace @gol
+-mieee -misize -minline-ic_invalidate -mpadstruct -mspace @gol
-mprefergot -musermode -multcost=@var{number} -mdiv=@var{strategy} @gol
-mdivsi3_libfunc=@var{name} @gol
-madjust-unroll -mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol
@@ -11894,6 +11894,19 @@ comparisons of NANs / infinities incurs
floating point comparison, therefore the default is set to
@option{-ffinite-math-only}.
+@item -minline-ic_invalidate
+@opindex minline-ic_invalidate
+Inline code to invalidate instruction cache entries after setting up
+nested function trampolines.
+This option has no effect if -musermode is in effect and the selected
+code generation option (e.g. -m4) does not allow the use of the icbi
+instruction.
+If the selected code generation option does not allow the use of the icbi
+instruction, and -musermode is not in effect, the inlined code will
+manipulate the instruction cache address array directly with an associative
+write. This not only requires privileged mode, but it will also
+fail if the cache line had been mapped via the TLB and has become unmapped.
+
@item -misize
@opindex misize
Dump instruction size and location in the assembly code.
@@ -11914,10 +11927,9 @@ the Global Offset Table instead of the P
@item -musermode
@opindex musermode
-Generate a library function call to invalidate instruction cache
-entries, after fixing up a trampoline. This library function call
-doesn't assume it can write to the whole memory address space. This
-is the default when the target is @code{sh-*-linux*}.
+Don't generate privileged mode only code; implies -mno-inline-ic_invalidate
+if the inlined code would not work in user mode.
+This is the default when the target is @code{sh-*-linux*}.
@item -multcost=@var{number}
@opindex multcost=@var{number}