This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



icache invalidation patch applied to sh port


Inlining the icache invalidation is usually more trouble than it's worth: it makes the code
dependent on the exact invalidation mechanism working on the target microarchitecture.
Multilibbing a large library because of one or two infrequently executed icache invalidations
wastes a lot of build time and disk space that could be employed more profitably
for actual optimizations. With the icache invalidation out of line, it is merely a matter
of picking the right ic_invalidate_array library at link time.
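
As a rough illustration of the choice described above, here is a simplified
model of the selection made after the trampoline is set up. It is a sketch,
not code from the patch: the names ic_strategy and choose_ic_strategy are
hypothetical, has_icbi stands in for (TARGET_SH4A_ARCH || TARGET_SH4_300),
and the SH5 paths are ignored. The split of the inline case into icbi versus
an address array write follows the invoke.texi text in the patch below.

```c
#include <stdbool.h>

/* Hypothetical model of the three possible outcomes; these names do not
   exist in GCC.  */
enum ic_strategy
{
  IC_LIBRARY_CALL,          /* out-of-line call to __ic_invalidate */
  IC_INLINE_ICBI,           /* inline ocbwb / synco / icbi sequence */
  IC_INLINE_ADDRESS_ARRAY   /* inline associative write, privileged mode only */
};

/* inline_ic_invalidate models -minline-ic_invalidate,
   has_icbi models (TARGET_SH4A_ARCH || TARGET_SH4_300),
   usermode models -musermode.  */
enum ic_strategy
choose_ic_strategy (bool inline_ic_invalidate, bool has_icbi, bool usermode)
{
  /* Same shape as the condition in sh_initialize_trampoline, with the
     grouping of || and && made explicit.  */
  if (!inline_ic_invalidate || (!has_icbi && usermode))
    return IC_LIBRARY_CALL;

  /* Inline code was requested and is usable: prefer icbi when the
     selected CPU has it, otherwise fall back to the address array.  */
  return has_icbi ? IC_INLINE_ICBI : IC_INLINE_ADDRESS_ARRAY;
}
```

Under this model the library call is the default, and the privileged
address array write is only chosen when inlining is requested explicitly,
-musermode is off and icbi is not available.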


Regression tested on i686-pc-linux-gnu X sh-elf.
2006-11-27  J"orn Rennecke  <joern.rennecke@st.com>

	* sh.opt (minline-ic_invalidate): New option.
	(musermode): Adjust comment.
	* sh.c (sh_initialize_trampoline): Emit library call unless
	is set; if it is set, don't emit library call if we can use icbi
	instead.
	* sh.md (ic_invalidate_line, ic_invalidate_line_sh4a): Also use
	icbi for TARGET_SH4_300.
	* t-sh (LIB1ASMFUNCS_CACHE): Set.
	* doc/invoke.texi: Document -minline-ic_invalidate; Update
	-musermode documentation.


--- config/sh/sh.opt@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0	2006-11-06 17:30:56.000000000 +0000
+++ config/sh/sh.opt	2006-11-27 22:19:11.000000000 +0000
@@ -269,6 +269,10 @@ mindexed-addressing
 Target Report Mask(ALLOW_INDEXED_ADDRESS) Condition(SUPPORT_ANY_SH5_32MEDIA)
 Enable the use of the indexed addressing mode for SHmedia32/SHcompact
 
+minline-ic_invalidate
+Target Report Var(TARGET_INLINE_IC_INVALIDATE)
+inline code to invalidate instruction cache entries after setting up nested function trampolines
+
 minvalid-symbols
 Target Report Mask(INVALID_SYMBOLS) Condition(SUPPORT_ANY_SH5)
 Assume symbols might be invalid
@@ -325,7 +329,7 @@ Cost to assume for a multiply insn
 
 musermode
 Target Report RejectNegative Mask(USERMODE)
-Generate library function call to invalidate instruction cache entries after fixing trampoline
+Don't generate privileged-mode only code; implies -mno-inline-ic_invalidate if the inline code would not work in user mode.
 
 ;; We might want to enable this by default for TARGET_HARD_SH4, because
 ;; zero-offset branches have zero latency.  Needs some benchmarking.
--- config/sh/sh.c@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0	2006-11-06 17:39:25.000000000 +0000
+++ config/sh/sh.c	2006-11-27 22:19:25.000000000 +0000
@@ -9564,7 +9564,8 @@ sh_initialize_trampoline (rtx tramp, rtx
   emit_move_insn (adjust_address (tramp_mem, SImode, 12), fnaddr);
   if (TARGET_HARVARD)
     {
-      if (TARGET_USERMODE)
+      if (!TARGET_INLINE_IC_INVALIDATE
+	  || !(TARGET_SH4A_ARCH || TARGET_SH4_300) && TARGET_USERMODE)
 	emit_library_call (function_symbol (NULL, "__ic_invalidate",
 					    FUNCTION_ORDINARY),
 			   0, VOIDmode, 1, tramp, SImode);
--- config/sh/sh.md@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0	2006-11-06 17:43:10.000000000 +0000
+++ config/sh/sh.md	2006-11-27 22:19:33.000000000 +0000
@@ -5249,7 +5249,7 @@ (define_expand "ic_invalidate_line"
       emit_insn (gen_ic_invalidate_line_compact (operands[0], operands[1]));
       DONE;
     }
-  else if (TARGET_SH4A_ARCH)
+  else if (TARGET_SH4A_ARCH || TARGET_SH4_300)
     {
       emit_insn (gen_ic_invalidate_line_sh4a (operands[0]));
       DONE;
@@ -5277,7 +5277,7 @@ (define_insn "ic_invalidate_line_i"
 (define_insn "ic_invalidate_line_sh4a"
   [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
 		    UNSPEC_ICACHE)]
-  "TARGET_SH4A_ARCH"
+  "TARGET_SH4A_ARCH || TARGET_SH4_300"
   "ocbwb\\t@%0\;synco\;icbi\\t@%0"
   [(set_attr "length" "16")
    (set_attr "type" "cwb")])
--- config/sh/t-sh@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0	2006-11-06 17:47:11.000000000 +0000
+++ config/sh/t-sh	2006-11-27 22:19:38.000000000 +0000
@@ -7,6 +7,7 @@ LIB1ASMFUNCS = _ashiftrt _ashiftrt_n _as
   _movmem_i4 _mulsi3 _sdivsi3 _sdivsi3_i4 _udivsi3 _udivsi3_i4 _set_fpscr \
   _div_table _udiv_qrnnd_16 \
   $(LIB1ASMFUNCS_CACHE)
+LIB1ASMFUNCS_CACHE = _ic_invalidate _ic_invalidate_array
 
 # We want fine grained libraries, so use the new code to build the
 # floating point emulation libraries.
--- doc/invoke.texi@@/main/GCC-4.1.0-int/renneckej-ic_invalid/0	2006-11-27 22:26:07.000000000 +0000
+++ doc/invoke.texi	2006-11-27 23:02:56.000000000 +0000
@@ -682,7 +682,7 @@ See RS/6000 and PowerPC Options.
 -m5-compact  -m5-compact-nofpu @gol
 -mb  -ml  -mdalign  -mrelax @gol
 -mbigtable  -mfmovd  -mhitachi -mrenesas -mno-renesas -mnomacsave @gol
--mieee  -misize  -mpadstruct  -mspace @gol
+-mieee  -misize  -minline-ic_invalidate -mpadstruct  -mspace @gol
 -mprefergot  -musermode -multcost=@var{number} -mdiv=@var{strategy} @gol
 -mdivsi3_libfunc=@var{name}  @gol
 -madjust-unroll -mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol
@@ -11894,6 +11894,19 @@ comparisons of NANs / infinities incurs 
 floating point comparison, therefore the default is set to
 @option{-ffinite-math-only}.
 
+@item -minline-ic_invalidate
+@opindex minline-ic_invalidate
+Inline code to invalidate instruction cache entries after setting up
+nested function trampolines.
+This option has no effect if -musermode is in effect and the selected
+code generation option (e.g. -m4) does not allow the use of the icbi
+instruction.
+If the selected code generation option does not allow the use of the icbi
+instruction, and -musermode is not in effect, the inlined code will
+manipulate the instruction cache address array directly with an associative
+write.  This not only requires privileged mode, but it will also
+fail if the cache line had been mapped via the TLB and has become unmapped.
+
 @item -misize
 @opindex misize
 Dump instruction size and location in the assembly code.
@@ -11914,10 +11927,9 @@ the Global Offset Table instead of the P
 
 @item -musermode
 @opindex musermode
-Generate a library function call to invalidate instruction cache
-entries, after fixing up a trampoline.  This library function call
-doesn't assume it can write to the whole memory address space.  This
-is the default when the target is @code{sh-*-linux*}.
+Don't generate privileged mode only code; implies -mno-inline-ic_invalidate
+if the inlined code would not work in user mode.
+This is the default when the target is @code{sh-*-linux*}.
 
 @item -multcost=@var{number}
 @opindex multcost=@var{number}
