This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
- From: Martin Liška <mliska at suse dot cz>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Biener <rguenther at suse dot de>, Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org, Marc Glisse <marc dot glisse at inria dot fr>, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Tue, 10 Apr 2018 14:27:59 +0200
- Subject: Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
- References: <4ca9c192-84f2-95ba-ffd7-1c9aa9be1dfd@suse.cz> <20180321103425.GJ8577@tucnak> <aeef50db-3550-08b3-22c4-4067696abe08@suse.cz> <20180328143114.GK8577@tucnak> <adc4fa95-1f8e-67ae-ffeb-81c1f239674b@suse.cz> <20180328163652.GL8577@tucnak> <772b1171-2321-67d9-85e7-358a5cad0efa@suse.cz> <20180329122532.GP8577@tucnak> <17bbc039-e511-4fbe-d534-3d6d21aadc00@suse.cz> <2d812eaf-8ea0-68e8-089b-0c3d89a203d8@suse.cz> <20180410091915.GA8577@tucnak>
On 04/10/2018 11:19 AM, Jakub Jelinek wrote:
> On Mon, Apr 09, 2018 at 02:31:04PM +0200, Martin Liška wrote:
>> gcc/testsuite/ChangeLog:
>>
>> 2018-03-28 Martin Liska <mliska@suse.cz>
>>
>> * gcc.dg/string-opt-1.c:
>
> I guess you really didn't mean to keep the above entry around, just the one
> below, right?
Sure, fixed.
>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-03-14 Martin Liska <mliska@suse.cz>
>>
>> * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
>> and others.
>
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -1607,6 +1607,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
>> x86_64-*-linux*)
>> tm_file="${tm_file} linux.h linux-android.h i386/linux-common.h i386/linux64.h"
>> extra_options="${extra_options} linux-android.opt"
>> + extra_objs="${extra_objs} x86-linux.o"
>> ;;
>
> This should go into the i[34567]86-*-linux*) case too (outside of the
> if test x$enable_targets = xall; then conditional).
> Or maybe better, remove the above and do it in:
> i[34567]86-*-linux* | x86_64-*-linux*)
> extra_objs="${extra_objs} cet.o"
> tmake_file="$tmake_file i386/t-linux i386/t-cet"
> ;;
> spot, just add x86-linux.o next to cet.o.
Done.
>
>> --- a/gcc/config/i386/linux.h
>> +++ b/gcc/config/i386/linux.h
>> @@ -24,3 +24,5 @@ along with GCC; see the file COPYING3. If not see
>>
>> #undef MUSL_DYNAMIC_LINKER
>> #define MUSL_DYNAMIC_LINKER "/lib/ld-musl-i386.so.1"
>> +
>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>> diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
>> index f2d913e30ac..d855f5cc239 100644
>> --- a/gcc/config/i386/linux64.h
>> +++ b/gcc/config/i386/linux64.h
>> @@ -37,3 +37,5 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>> #define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-x86_64.so.1"
>> #undef MUSL_DYNAMIC_LINKERX32
>> #define MUSL_DYNAMIC_LINKERX32 "/lib/ld-musl-x32.so.1"
>> +
>> +#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
>
> And the above two changes should be replaced by a change in
> gcc/config/i386/linux-common.h.
Likewise.
>
>> +#include "coretypes.h"
>> +#include "cp/cp-tree.h" /* This is why we're a separate module. */
>
> Why do you need cp/cp-tree.h? That is just too weird.
> The function just uses libc_speed (in coretypes.h), built_in_function
> (likewise), and OPTION_GLIBC (config/linux.h).
I ended up with a minimal set of includes:
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
I'm retesting the patch.
Martin
>
> Jakub
>
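For readers following the thread, here is a minimal sketch of the semantic difference the patch relies on: mempcpy copies like memcpy but returns the address just past the last byte written, so a caller that needs the end pointer (the endp == 1 case in the patch below) saves an addition when libc provides a fast mempcpy. The names sketch_mempcpy and sketch_concat are hypothetical and not part of the patch; the helper is written in terms of memcpy so it does not depend on the GNU extension being available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for glibc's mempcpy: copy N bytes from SRC to DEST
   and return the address one past the last byte written, i.e. DEST + N.  */
static void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  assert (dest != NULL && src != NULL);
  return (char *) memcpy (dest, src, n) + n;
}

/* Concatenate A and B into OUT, which must hold at least
   strlen (A) + strlen (B) + 1 bytes, by chaining the returned end
   pointer.  Returns OUT.  */
static char *
sketch_concat (char *out, const char *a, const char *b)
{
  char *p = sketch_mempcpy (out, a, strlen (a));
  p = sketch_mempcpy (p, b, strlen (b));
  *p = '\0';
  return out;
}
```

With plain memcpy the caller would have to recompute out + strlen (a) itself, which is roughly the transformation the middle-end currently performs and which this patch skips on targets that report a fast mempcpy.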
From bed35715063f9435b697eaf4c9868f81e8556de8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 14 Mar 2018 09:44:18 +0100
Subject: [PATCH] Introduce new libc_func_speed target hook (PR
middle-end/81657).
gcc/ChangeLog:
2018-03-14 Martin Liska <mliska@suse.cz>
PR middle-end/81657
* builtins.c (expand_builtin_memory_copy_args): Handle the situation
when libc provides a fast mempcpy implementation.
* config/linux-protos.h (ix86_linux_libc_func_speed): New.
(TARGET_LIBC_FUNC_SPEED): Likewise.
* config/i386/linux-common.h (SUBTARGET_LIBC_FUNC_SPEED): Define
macro.
* config/i386/t-linux: Add x86-linux.o.
* config.gcc: Likewise.
* config/i386/x86-linux.c: New file.
* coretypes.h (enum libc_speed): Likewise.
* doc/tm.texi: Document new target hook.
* doc/tm.texi.in: Likewise.
* expr.c (emit_block_move_hints): Handle libc bail out argument.
* expr.h (emit_block_move_hints): Add new parameters.
* target.def: Add new hook.
* targhooks.c (enum libc_speed): New enum.
(default_libc_func_speed): Provide a default hook
implementation.
* targhooks.h (default_libc_func_speed): Likewise.
gcc/testsuite/ChangeLog:
2018-03-14 Martin Liska <mliska@suse.cz>
* gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target
and others.
---
gcc/builtins.c | 15 ++++++++++-
gcc/config.gcc | 2 +-
gcc/config/i386/i386.c | 5 ++++
gcc/config/i386/linux-common.h | 2 ++
gcc/config/i386/t-linux | 6 +++++
gcc/config/i386/x86-linux.c | 52 +++++++++++++++++++++++++++++++++++++
gcc/config/linux-protos.h | 1 +
gcc/coretypes.h | 7 +++++
gcc/doc/tm.texi | 4 +++
gcc/doc/tm.texi.in | 1 +
gcc/expr.c | 11 +++++++-
gcc/expr.h | 3 ++-
gcc/target.def | 7 +++++
gcc/targhooks.c | 9 +++++++
gcc/targhooks.h | 1 +
gcc/testsuite/gcc.dg/string-opt-1.c | 5 ++--
16 files changed, 125 insertions(+), 6 deletions(-)
create mode 100644 gcc/config/i386/x86-linux.c
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 487d9d58db2..98ee3fb272d 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3651,13 +3651,26 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
src_mem = get_memory_rtx (src, len);
set_mem_align (src_mem, src_align);
+ /* emit_block_move_hints can generate a library call to memcpy.
+ When libc provides a fast implementation of mempcpy, it is better
+ to call mempcpy directly. */
+ bool avoid_libcall
+ = (endp == 1
+ && targetm.libc_func_speed ((int)BUILT_IN_MEMPCPY) == LIBC_FAST_SPEED
+ && target != const0_rtx);
+
/* Copy word part most expediently. */
+ bool libcall_avoided = false;
dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx,
CALL_EXPR_TAILCALL (exp)
&& (endp == 0 || target == const0_rtx)
? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL,
expected_align, expected_size,
- min_size, max_size, probable_max_size);
+ min_size, max_size, probable_max_size,
+ avoid_libcall ? &libcall_avoided : NULL);
+
+ if (libcall_avoided)
+ return NULL_RTX;
if (dest_addr == 0)
{
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1b58c060a92..7fe43856b6a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4617,7 +4617,7 @@ case ${target} in
i[34567]86-*-darwin* | x86_64-*-darwin*)
;;
i[34567]86-*-linux* | x86_64-*-linux*)
- extra_objs="${extra_objs} cet.o"
+ extra_objs="${extra_objs} cet.o x86-linux.o"
tmake_file="$tmake_file i386/t-linux i386/t-cet"
;;
i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b4f6aec1434..2471ff7b99a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -52105,6 +52105,11 @@ ix86_run_selftests (void)
#undef TARGET_WARN_PARAMETER_PASSING_ABI
#define TARGET_WARN_PARAMETER_PASSING_ABI ix86_warn_parameter_passing_abi
+#ifdef SUBTARGET_LIBC_FUNC_SPEED
+#undef TARGET_LIBC_FUNC_SPEED
+#define TARGET_LIBC_FUNC_SPEED SUBTARGET_LIBC_FUNC_SPEED
+#endif
+
#if CHECKING_P
#undef TARGET_RUN_TARGET_SELFTESTS
#define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests
diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h
index d877387021b..1b48c15e5c0 100644
--- a/gcc/config/i386/linux-common.h
+++ b/gcc/config/i386/linux-common.h
@@ -126,3 +126,5 @@ extern void file_end_indicate_exec_stack_and_cet (void);
#undef TARGET_ASM_FILE_END
#define TARGET_ASM_FILE_END file_end_indicate_exec_stack_and_cet
+
+#define SUBTARGET_LIBC_FUNC_SPEED ix86_linux_libc_func_speed
diff --git a/gcc/config/i386/t-linux b/gcc/config/i386/t-linux
index 155314c08a7..6e3ebe94fe8 100644
--- a/gcc/config/i386/t-linux
+++ b/gcc/config/i386/t-linux
@@ -1 +1,7 @@
MULTIARCH_DIRNAME = $(call if_multiarch,i386-linux-gnu)
+
+x86-linux.o: $(srcdir)/config/i386/x86-linux.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+ $(TM_H) $(RTL_H) $(REGS_H) hard-reg-set.h output.h $(TREE_H) flags.h \
+ $(TM_P_H) $(HASHTAB_H) $(GGC_H)
+ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+ $(srcdir)/config/i386/x86-linux.c
diff --git a/gcc/config/i386/x86-linux.c b/gcc/config/i386/x86-linux.c
new file mode 100644
index 00000000000..5e4331f635a
--- /dev/null
+++ b/gcc/config/i386/x86-linux.c
@@ -0,0 +1,52 @@
+/* Implementation of Linux-specific functions for i386 and x86-64 systems.
+ Copyright (C) 2018 Free Software Foundation, Inc.
+ Contributed by Martin Liska <mliska@suse.cz>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+#define IN_TARGET_CODE 1
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+
+/* This hook determines whether libc provides a fast implementation of the
+ built-in function FN. We override it for the i386 target with glibc,
+ as this combination provides a fast implementation of mempcpy. */
+
+enum libc_speed
+ix86_linux_libc_func_speed (int fn)
+{
+ enum built_in_function f = (built_in_function)fn;
+
+ if (!OPTION_GLIBC)
+ return LIBC_UNKNOWN_SPEED;
+
+ switch (f)
+ {
+ case BUILT_IN_MEMPCPY:
+ return LIBC_FAST_SPEED;
+ default:
+ return LIBC_UNKNOWN_SPEED;
+ }
+}
diff --git a/gcc/config/linux-protos.h b/gcc/config/linux-protos.h
index 9da8dd7ecaa..b7284735366 100644
--- a/gcc/config/linux-protos.h
+++ b/gcc/config/linux-protos.h
@@ -20,3 +20,4 @@ along with GCC; see the file COPYING3. If not see
extern bool linux_has_ifunc_p (void);
extern bool linux_libc_has_function (enum function_class fn_class);
+extern enum libc_speed ix86_linux_libc_func_speed (int fn);
diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index 283b4eb33fe..fe618f708f4 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -384,6 +384,13 @@ enum excess_precision_type
EXCESS_PRECISION_TYPE_FAST
};
+enum libc_speed
+{
+ LIBC_FAST_SPEED,
+ LIBC_SLOW_SPEED,
+ LIBC_UNKNOWN_SPEED
+};
+
/* Support for user-provided GGC and PCH markers. The first parameter
is a pointer to a pointer, the second a cookie. */
typedef void (*gt_pointer_operator) (void *, void *);
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index bd8b917ba82..0f7c91a22c4 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5501,6 +5501,10 @@ macro, a reasonable default is used.
This hook determines whether a function from a class of functions
@var{fn_class} is present at the runtime.
@end deftypefn
+@deftypefn {Target Hook} libc_speed TARGET_LIBC_FUNC_SPEED (int @var{fn})
+This hook determines whether libc provides a fast implementation of the
+built-in function @var{fn}.
+@end deftypefn
@defmac NEXT_OBJC_RUNTIME
Set this macro to 1 to use the "NeXT" Objective-C message sending conventions
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index b0207146e8c..4bb2998a8a1 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3933,6 +3933,7 @@ macro, a reasonable default is used.
@end defmac
@hook TARGET_LIBC_HAS_FUNCTION
+@hook TARGET_LIBC_FUNC_SPEED
@defmac NEXT_OBJC_RUNTIME
Set this macro to 1 to use the "NeXT" Objective-C message sending conventions
diff --git a/gcc/expr.c b/gcc/expr.c
index 00660293f72..f3bd698bc4d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1554,6 +1554,8 @@ compare_by_pieces (rtx arg0, rtx arg1, unsigned HOST_WIDE_INT len,
MIN_SIZE is the minimal size of block to move
MAX_SIZE is the maximal size of block to move, if it can not be represented
in unsigned HOST_WIDE_INT, than it is mask of all ones.
+ If BAIL_OUT_LIBCALL is non-null, do not emit a library call; instead set
+ *BAIL_OUT_LIBCALL to true when the move is not done.
Return the address of the new block, if memcpy is called and returns it,
0 otherwise. */
@@ -1563,7 +1565,8 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
unsigned int expected_align, HOST_WIDE_INT expected_size,
unsigned HOST_WIDE_INT min_size,
unsigned HOST_WIDE_INT max_size,
- unsigned HOST_WIDE_INT probable_max_size)
+ unsigned HOST_WIDE_INT probable_max_size,
+ bool *bail_out_libcall)
{
bool may_use_call;
rtx retval = 0;
@@ -1625,6 +1628,12 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
&& ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (x))
&& ADDR_SPACE_GENERIC_P (MEM_ADDR_SPACE (y)))
{
+ if (bail_out_libcall)
+ {
+ *bail_out_libcall = true;
+ return retval;
+ }
+
/* Since x and y are passed to a libcall, mark the corresponding
tree EXPR as addressable. */
tree y_expr = MEM_EXPR (y);
diff --git a/gcc/expr.h b/gcc/expr.h
index b3d523bcb24..c2bf87fd14e 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -110,7 +110,8 @@ extern rtx emit_block_move_hints (rtx, rtx, rtx, enum block_op_methods,
unsigned int, HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT,
- unsigned HOST_WIDE_INT);
+ unsigned HOST_WIDE_INT,
+ bool *bail_out_libcall = NULL);
extern rtx emit_block_cmp_hints (rtx, rtx, rtx, tree, rtx, bool,
by_pieces_constfn, void *);
extern bool emit_storent_insn (rtx to, rtx from);
diff --git a/gcc/target.def b/gcc/target.def
index c5b2a1e7e71..3bbddc82776 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2639,6 +2639,13 @@ DEFHOOK
bool, (enum function_class fn_class),
default_libc_has_function)
+DEFHOOK
+(libc_func_speed,
+ "This hook determines whether libc provides a fast implementation of the\n\
+built-in function @var{fn}.",
+ libc_speed, (int fn),
+ default_libc_func_speed)
+
/* True if new jumps cannot be created, to replace existing ones or
not, at the current point in the compilation. */
DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index fafcc6c5196..6e44f6f79cf 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1642,6 +1642,15 @@ no_c99_libc_has_function (enum function_class fn_class ATTRIBUTE_UNUSED)
return false;
}
+/* This hook determines whether libc provides a fast implementation of the
+ built-in function FN. */
+
+enum libc_speed
+default_libc_func_speed (int)
+{
+ return LIBC_UNKNOWN_SPEED;
+}
+
tree
default_builtin_tm_load_store (tree ARG_UNUSED (type))
{
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 8a4393f2ba4..7508673ad0a 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -205,6 +205,7 @@ extern bool default_have_conditional_execution (void);
extern bool default_libc_has_function (enum function_class);
extern bool no_c99_libc_has_function (enum function_class);
extern bool gnu_libc_has_function (enum function_class);
+extern enum libc_speed default_libc_func_speed (int);
extern tree default_builtin_tm_load_store (tree);
diff --git a/gcc/testsuite/gcc.dg/string-opt-1.c b/gcc/testsuite/gcc.dg/string-opt-1.c
index 2f060732bf0..7faaadcbb1f 100644
--- a/gcc/testsuite/gcc.dg/string-opt-1.c
+++ b/gcc/testsuite/gcc.dg/string-opt-1.c
@@ -48,5 +48,6 @@ main (void)
return 0;
}
-/* { dg-final { scan-assembler-not "\<mempcpy\>" } } */
-/* { dg-final { scan-assembler "memcpy" } } */
+/* { dg-final { scan-assembler-not "\<mempcpy\>" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } } */
+/* { dg-final { scan-assembler "mempcpy" { target { i?86-*-gnu* x86_64-*-gnu* i?86-*-linux* x86_64-*-linux* } } } } */
--
2.16.3
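
The control flow of the patch above can be modelled outside GCC roughly as follows. This is only a sketch under stated assumptions: every sketch_* name is hypothetical, is_glibc stands in for OPTION_GLIBC, result_used stands in for the target != const0_rtx check, and the model covers only the path where emit_block_move_hints would otherwise fall back to a library call (inline expansion is not modelled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors enum libc_speed as added to coretypes.h by the patch.  */
enum libc_speed { LIBC_FAST_SPEED, LIBC_SLOW_SPEED, LIBC_UNKNOWN_SPEED };

/* Hypothetical stand-ins for GCC's built_in_function codes.  */
enum sketch_builtin { SKETCH_MEMCPY, SKETCH_MEMPCPY };

/* Model of ix86_linux_libc_func_speed: only mempcpy on glibc is
   reported as fast; everything else is unknown.  */
static enum libc_speed
sketch_libc_func_speed (enum sketch_builtin fn, bool is_glibc)
{
  if (!is_glibc)
    return LIBC_UNKNOWN_SPEED;
  return fn == SKETCH_MEMPCPY ? LIBC_FAST_SPEED : LIBC_UNKNOWN_SPEED;
}

/* Model of the libcall fallback in emit_block_move_hints.  Returns true
   if a memcpy libcall would be emitted.  If BAIL_OUT is non-null, nothing
   is emitted and *BAIL_OUT is set to true instead.  */
static bool
sketch_emit_libcall (bool *bail_out)
{
  if (bail_out)
    {
      *bail_out = true;
      return false;
    }
  return true;
}

/* Model of the caller in expand_builtin_memory_copy_args.  Returns true
   when expansion gives up (the patch returns NULL_RTX), so the original
   call to mempcpy is kept.  */
static bool
sketch_keeps_mempcpy_call (int endp, bool result_used, bool is_glibc)
{
  assert (endp >= 0 && endp <= 2);
  bool avoid_libcall
    = (endp == 1
       && sketch_libc_func_speed (SKETCH_MEMPCPY, is_glibc) == LIBC_FAST_SPEED
       && result_used);
  bool libcall_avoided = false;
  sketch_emit_libcall (avoid_libcall ? &libcall_avoided : NULL);
  return libcall_avoided;
}
```

In this model the mempcpy call survives only when all three conditions hold, which is consistent with the adjusted scans in gcc.dg/string-opt-1.c expecting mempcpy on the i?86/x86_64 Linux and GNU targets and memcpy elsewhere.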