Flag for handling inlining of strcmp/memcmp on i386
Martin Thuresson
martint@google.com
Tue Sep 29 20:50:00 GMT 2009
GCC currently inlines memcmp and strcmp as repz cmpsb when
optimizing. Because the library implementations are optimized,
for example by reading full, aligned words, the inlined
byte-by-byte comparison is usually slower than calling the
library functions.
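To illustrate the difference, here is a minimal sketch of the two
approaches in portable C. It is not the glibc implementation and
not what GCC emits; the function names are made up for this
example, and the word-wise version uses memcpy for its loads
instead of the aligned reads a real library would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte-by-byte comparison: one byte per step, similar in spirit
   to what "repz cmpsb" does. Returns <0, 0 or >0. */
int compare_bytewise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}

/* Word-at-a-time comparison: skips sizeof(size_t) bytes per step
   while the buffers agree, then falls back to bytes to locate
   the first difference (or to handle the tail). */
int compare_wordwise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    size_t i = 0;
    while (n - i >= sizeof(size_t)) {
        size_t wa, wb;
        memcpy(&wa, p + i, sizeof wa);  /* avoids alignment issues */
        memcpy(&wb, q + i, sizeof wb);
        if (wa != wb)
            break;
        i += sizeof(size_t);
    }
    return compare_bytewise(p + i, q + i, n - i);
}

/* Sanity check: both sketches agree in sign with memcmp. */
void check_against_memcmp(const void *a, const void *b, size_t n)
{
    int ref = memcmp(a, b, n);
    assert((compare_bytewise(a, b, n) < 0) == (ref < 0));
    assert((compare_bytewise(a, b, n) > 0) == (ref > 0));
    assert((compare_wordwise(a, b, n) < 0) == (ref < 0));
    assert((compare_wordwise(a, b, n) > 0) == (ref > 0));
}
```

For long buffers that are equal, the word-wise loop does roughly
1/sizeof(size_t) as many iterations as the byte loop, which is the
case where the gap between the two is largest.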
The diagrams show performance numbers for the library
call and the inlined version. The numbers are from a
microbenchmark that compares buffers of various lengths,
both equal and unequal.
http://www.ce.chalmers.se/~martin/foo/amd_opteron_call_repz.png
http://www.ce.chalmers.se/~martin/foo/intel_core_call_repz.png
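The benchmark source is not part of this mail. A rough sketch of
how such a measurement could be set up is below; the names
time_compare and bytewise, the fill pattern and the use of clock()
are assumptions made for illustration, not the code behind the
diagrams.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef int (*compare_fn)(const void *, const void *, size_t);

/* Stand-in for the inlined byte-by-byte version. */
int bytewise(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a, *q = b;
    for (size_t i = 0; i < n; i++)
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    return 0;
}

/* Times `iters` comparisons of two `len`-byte buffers that are
   either equal or differ in their last byte. Returns elapsed CPU
   seconds, or -1.0 if allocation fails. */
double time_compare(compare_fn cmp, size_t len, long iters, int equal)
{
    unsigned char *a = malloc(len + 1);
    unsigned char *b = malloc(len + 1);
    if (!a || !b) {
        free(a);
        free(b);
        return -1.0;
    }
    memset(a, 'x', len);
    memset(b, 'x', len);
    if (!equal && len > 0)
        b[len - 1] = 'y';

    /* Volatile pointers and sink keep the compiler from hoisting
       or dropping the calls inside the timed loop. */
    unsigned char *volatile pa = a;
    unsigned char *volatile pb = b;
    volatile int sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink = cmp(pa, pb, len);
    clock_t end = clock();
    (void)sink;

    free(a);
    free(b);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_compare(memcmp, len, iters, 1) and
time_compare(bytewise, len, iters, 1) over a range of lengths
gives the two curves for the equal case; passing 0 as the last
argument gives the unequal case.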
The performance impact can be large for programs handling large
strings that are expected to be equal, though I did not see any
performance change on Spec2006 (less than 1% difference).
This patch introduces the flag -minline-compares that controls
the inlining.
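As a hypothetical example of code affected by the flag, consider
the two functions below (the names are made up for illustration).
With a compiler built from this patch, compiling with and without
-minline-compares and comparing the assembly should show whether
the comparison is expanded inline or left as a library call.
Whether a particular call is actually inlined still depends on
the other conditions in the expander, and a compiler without the
patch will reject the flag.

```c
#include <stddef.h>
#include <string.h>

/* Inspect with the patched compiler, for example:
     gcc -O2 -minline-compares -S example.c
     gcc -O2 -S example.c
   The flag only exists once this patch is applied. */

/* strcmp against a short constant string. */
int is_get_request(const char *method)
{
    return strcmp(method, "GET") == 0;
}

/* memcmp over caller-supplied buffers of n bytes. */
int equal_blocks(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

The results of both functions are the same either way; the flag
only changes how the comparison is carried out.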
Thanks,
Martin
-------------- next part --------------
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 152256)
+++ doc/invoke.texi (working copy)
@@ -594,7 +594,8 @@ Objective-C and Objective-C++ Dialects}.
-maes -mpclmul @gol
-msse4a -m3dnow -mpopcnt -mabm @gol
-mthreads -mno-align-stringops -minline-all-stringops @gol
--minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
+-minline-compares -minline-stringops-dynamically @gol
+-mstringop-strategy=@var{alg} @gol
-mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
-m96bit-long-double -mregparm=@var{num} -msseregparm @gol
-mveclibabi=@var{type} -mpc32 -mpc64 -mpc80 -mstackrealign @gol
@@ -11886,6 +11887,12 @@ aligned at least to 4 byte boundary. Th
size, but may improve performance of code that depends on fast memcpy, strlen
and memset for short lengths.
+@item -minline-compares
+@opindex minline-compares
+This option enables GCC to inline calls to memcmp and strcmp. The
+inlined version does a byte-by-byte comparison using a repeat string
+operation prefix.
+
@item -minline-stringops-dynamically
@opindex minline-stringops-dynamically
For string operation of unknown size, inline runtime checks so for small
Index: testsuite/gcc.dg/20050503-1.c
===================================================================
--- testsuite/gcc.dg/20050503-1.c (revision 152256)
+++ testsuite/gcc.dg/20050503-1.c (working copy)
@@ -3,7 +3,8 @@
expanders. */
/* { dg-do compile } */
/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { ilp32 && { ! nonpic } } } { "*" } { "" } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -minline-compares" { target { i?86-*-* || x86_64-*-* } } } */
+/* { dg-options "-O2" { target {! { i?86-*-* || x86_64-*-* } } } } */
typedef __SIZE_TYPE__ size_t;
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 152256)
+++ config/i386/i386.md (working copy)
@@ -20033,6 +20033,9 @@ (define_expand "cmpstrnsi"
{
rtx addr1, addr2, out, outlow, count, countreg, align;
+ if (!TARGET_INLINE_COMPARES)
+ FAIL;
+
if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
FAIL;
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt (revision 152256)
+++ config/i386/i386.opt (working copy)
@@ -140,6 +140,10 @@ minline-all-stringops
Target Report Mask(INLINE_ALL_STRINGOPS) Save
Inline all known string operations
+minline-compares
+Target Report Mask(INLINE_COMPARES) Save
+Inline compare operations strcmp and memcmp
+
minline-stringops-dynamically
Target Report Mask(INLINE_STRINGOPS_DYNAMICALLY) Save
Inline memset/memcpy string operations, but perform inline version only for small blocks
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 152256)
+++ config/i386/i386.c (working copy)
@@ -2393,6 +2393,7 @@ ix86_target_string (int isa, int flags,
{ "-mfp-ret-in-387", MASK_FLOAT_RETURNS },
{ "-mieee-fp", MASK_IEEE_FP },
{ "-minline-all-stringops", MASK_INLINE_ALL_STRINGOPS },
+ { "-minline-compares", MASK_INLINE_COMPARES },
{ "-minline-stringops-dynamically", MASK_INLINE_STRINGOPS_DYNAMICALLY },
{ "-mms-bitfields", MASK_MS_BITFIELD_LAYOUT },
{ "-mno-align-stringops", MASK_NO_ALIGN_STRINGOPS },
@@ -3642,6 +3643,10 @@ ix86_valid_target_attribute_inner_p (tre
OPT_minline_all_stringops,
MASK_INLINE_ALL_STRINGOPS),
+ IX86_ATTR_YES ("inline-compares",
+ OPT_minline_compares,
+ MASK_INLINE_COMPARES),
+
IX86_ATTR_YES ("inline-stringops-dynamically",
OPT_minline_stringops_dynamically,
MASK_INLINE_STRINGOPS_DYNAMICALLY),
-------------- next part --------------
2009-09-29 Martin Thuresson <martint@google.com>
* config/i386/i386.c (ix86_target_string)
(ix86_valid_target_attribute_inner_p): Add minline-compares support.
* config/i386/i386.md (cmpstrnsi): Update conditional.
* config/i386/i386.opt (minline-compares): Add.
* doc/invoke.texi (minline-compares): Document.
2009-09-29 Martin Thuresson <martint@google.com>
* gcc.dg/20050503-1.c: Adjust to use minline-compares.