This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[3.4-BIB] i386 string functions tweek
- From: Jan Hubicka <jh at suse dot cz>
- To: gcc-patches at gcc dot gnu dot org, rth at cygnus dot com
- Date: Sat, 30 Nov 2002 20:26:45 +0100
- Subject: [3.4-BIB] i386 string functions tweek
Hi,
we currently emit a rep;movsl sequence whenever we know the source is aligned.
On Athlon the library function can do significantly better with
prefetching and/or streaming moves, so this patch restricts the inline
sequence to the CPUs where it is optimal.
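For illustration, here is a minimal sketch of the kind of source this concerns. The struct and function names are made up for the example and do not come from GCC or its testsuite. The assumption is that a fixed-size copy between int-aligned objects lets the expander see word alignment on a 32-bit target, which is the case that currently gets the inline rep;movsl sequence.

```c
#include <string.h>

/* Hypothetical type: its alignment is that of int, so on a 32-bit
   target the compiler can presumably prove word alignment for both
   operands of the copy.  */
struct buf
{
  int words[256];
};

/* A block copy with a compile-time constant size.  Whether this is
   expanded inline or left as a call to memcpy is the decision the
   patch changes for Athlon.  */
void
copy_buf (struct buf *dst, const struct buf *src)
{
  memcpy (dst, src, sizeof *dst);
}
```

Either expansion must give the same result; only the speed and code size differ.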
Sat Nov 30 20:23:57 CET 2002 Jan Hubicka <jh@suse.cz>
* i386.c (x86_rep_movl_optimal): New variable.
(ix86_expand_movstr, ix86_expand_clrstr): Use TARGET_REP_MOVL_OPTIMAL.
* i386.h (TARGET_REP_MOVL_OPTIMAL): New macro.
Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.447.2.35
diff -c -3 -p -r1.447.2.35 i386.c
*** i386.c 28 Nov 2002 22:58:21 -0000 1.447.2.35
--- i386.c 30 Nov 2002 19:06:38 -0000
*************** const int x86_sse_partial_regs_for_cvtsd
*** 510,515 ****
--- 510,516 ----
const int x86_sse_typeless_stores = m_ATHLON_K8;
const int x86_sse_load0_by_pxor = m_PPRO | m_PENT4;
const int x86_use_ffreep = m_ATHLON_K8;
+ const int x86_rep_movl_optimal = m_386 | m_PENT | m_PPRO | m_K6;
/* In case the avreage insn count for single function invocation is
lower than this constant, emit fast (but longer) prologue and
*************** ix86_expand_movstr (dst, src, count_exp,
*** 10552,10559 ****
/* In case we don't know anything about the alignment, default to
library version, since it is usually equally fast and result in
! shorter code. */
! if (!TARGET_INLINE_ALL_STRINGOPS && align < UNITS_PER_WORD)
{
end_sequence ();
return 0;
--- 10559,10570 ----
/* In case we don't know anything about the alignment, default to
library version, since it is usually equally fast and result in
! shorter code.
!
! Also emit call when we know that the count is large and call overhead
! will not be important. */
! if (!TARGET_INLINE_ALL_STRINGOPS
! && (align < UNITS_PER_WORD || !TARGET_REP_MOVL_OPTIMAL))
{
end_sequence ();
return 0;
*************** ix86_expand_clrstr (src, count_exp, alig
*** 10767,10774 ****
/* In case we don't know anything about the alignment, default to
library version, since it is usually equally fast and result in
! shorter code. */
! if (!TARGET_INLINE_ALL_STRINGOPS && align < UNITS_PER_WORD)
return 0;
if (TARGET_SINGLE_STRINGOP)
--- 10778,10789 ----
/* In case we don't know anything about the alignment, default to
library version, since it is usually equally fast and result in
! shorter code.
!
! Also emit call when we know that the count is large and call overhead
! will not be important. */
! if (!TARGET_INLINE_ALL_STRINGOPS
! && (align < UNITS_PER_WORD || !TARGET_REP_MOVL_OPTIMAL))
return 0;
if (TARGET_SINGLE_STRINGOP)
Index: i386.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.h,v
retrieving revision 1.280.4.21
diff -c -3 -p -r1.280.4.21 i386.h
*** i386.h 27 Nov 2002 19:35:34 -0000 1.280.4.21
--- i386.h 30 Nov 2002 19:23:45 -0000
*************** extern int x86_prefetch_sse;
*** 281,286 ****
--- 281,287 ----
#define TARGET_PREFETCH_SSE (x86_prefetch_sse)
#define TARGET_SHIFT1 (x86_shift1 & CPUMASK)
#define TARGET_USE_FFREEP (x86_use_ffreep & CPUMASK)
+ #define TARGET_REP_MOVL_OPTIMAL (x86_rep_movl_optimal & CPUMASK)
#define TARGET_STACK_PROBE (target_flags & MASK_STACK_PROBE)
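
To make the new condition easier to follow outside of the compiler, here is a small standalone model of it. This is only a sketch: the CPU_* bits and the function use_library_call are hypothetical stand-ins for the m_* masks, CPUMASK and the test in ix86_expand_movstr/ix86_expand_clrstr, and the bit values are not the ones GCC uses.

```c
#include <stdbool.h>

/* Hypothetical processor bits standing in for m_386, m_PENT, etc.
   The values are arbitrary and chosen only for this example.  */
enum
{
  CPU_386 = 1 << 0,
  CPU_PENT = 1 << 1,
  CPU_PPRO = 1 << 2,
  CPU_K6 = 1 << 3,
  CPU_ATHLON = 1 << 4,
  CPU_PENT4 = 1 << 5
};

/* Models x86_rep_movl_optimal from the patch: the CPUs for which the
   inline rep;movsl sequence is considered optimal.  */
static const int rep_movl_optimal = CPU_386 | CPU_PENT | CPU_PPRO | CPU_K6;

/* Returns true when the expander would give up (return 0) and leave
   the operation to the library function.  cpu_mask plays the role of
   CPUMASK and holds the single bit of the CPU being tuned for.  */
static bool
use_library_call (int cpu_mask, bool inline_all_stringops,
                  unsigned align, unsigned units_per_word)
{
  bool rep_optimal = (rep_movl_optimal & cpu_mask) != 0;

  return !inline_all_stringops
         && (align < units_per_word || !rep_optimal);
}
```

With this model, an aligned copy tuned for Athlon goes to the library, the same copy tuned for Pentium stays inline, and the inline-all-stringops setting overrides both checks.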